    Google Introduces AI-Powered Text-to-Speech for Many Application Types

    Written by

    Jaikumar Vijayan
    Published March 28, 2018

      Google has launched a new technology that makes it easier for businesses to add natural-sounding speech capabilities to their applications and services.

      Cloud Text-to-Speech is available, currently in beta, as an API that developers can use to enable voice interaction in a wide range of use cases.

      Examples include powering interactive voice response systems in call centers; adding voice response capabilities to TVs, cars and internet of things devices; and automatically converting news articles, books and other text-based media to audiobooks and podcasts.

      Developers can choose from 32 different voices in 12 languages when adding voice capabilities to an application, service or device using Cloud Text-to-Speech.

      Cloud Text-to-Speech allows developers to customize attributes such as speaking rate, pitch and volume gain, according to Dan Aharon, product manager of Cloud AI at Google.
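
As a rough illustration of those attributes, the sketch below builds the JSON body for a speech synthesis request without sending it. The field names (`speakingRate`, `pitch`, `volumeGainDb`) and the allowed ranges are taken from Google's public REST reference as best recalled here, not from this article, so treat them as assumptions and check them against the current documentation. The helper name `build_synthesize_request` is hypothetical.

```python
import json

# Assumed limits, based on the public REST reference (verify before relying on them).
LIMITS = {
    "speakingRate": (0.25, 4.0),   # 1.0 is the voice's normal speed
    "pitch": (-20.0, 20.0),        # semitones relative to the default pitch
    "volumeGainDb": (-96.0, 16.0), # decibels relative to the default volume
}


def build_synthesize_request(text, language_code="en-US", voice_name=None,
                             speaking_rate=1.0, pitch=0.0, volume_gain_db=0.0):
    """Return a request body as a plain dict (hypothetical helper, no network call)."""
    audio_config = {
        "audioEncoding": "MP3",
        "speakingRate": speaking_rate,
        "pitch": pitch,
        "volumeGainDb": volume_gain_db,
    }
    # Reject values outside the assumed ranges before anything is sent.
    for field, (low, high) in LIMITS.items():
        value = audio_config[field]
        if not low <= value <= high:
            raise ValueError(f"{field}={value} is outside [{low}, {high}]")
    voice = {"languageCode": language_code}
    if voice_name is not None:
        voice["name"] = voice_name
    return {"input": {"text": text}, "voice": voice, "audioConfig": audio_config}


if __name__ == "__main__":
    body = build_synthesize_request("Your order has shipped.",
                                    speaking_rate=1.1, pitch=-2.0)
    print(json.dumps(body, indent=2))
```

Sending the body to the service would additionally require an endpoint and credentials, which are left out here on purpose.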

      The technology is designed to pronounce complex text such as names, dates and addresses correctly and authentically without any tweaking or customization, Aharon wrote in a March 27 blog post announcing Cloud Text-to-Speech.

      Some of the high-fidelity voices available with the new technology use WaveNet from DeepMind, a U.K.-based artificial intelligence firm that Google acquired in 2014 and that is now an Alphabet subsidiary.

      WaveNet is a deep neural network for generating speech that mimics human voices. Speech generated with WaveNet sounds far more natural than the output of even the best other text-to-speech systems, according to Google.

      The technology differs from the most common current approach to generating speech with computers, which is to select short speech fragments and concatenate them into whole utterances.

      With concatenative text-to-speech technologies, a large database of speech fragments from a single speaker is first recorded, and those fragments are then recombined as needed to make complete sentences, Google noted. This approach makes it hard to modify the voice or alter the emotion or emphasis of the computer-generated speech, according to Google.
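
The concatenative idea can be sketched with a toy example. Everything here is invented for illustration: the `FRAGMENTS` table, its unit names and its sample values are made up and do not come from any real speech system. The point is only that the output is assembled entirely from prerecorded pieces, which is why changing the voice or emphasis would mean recording a new database.

```python
# Toy "database": each speech unit maps to a short list of made-up audio samples.
FRAGMENTS = {
    "hel": [0.1, 0.3, 0.2],
    "lo": [0.4, 0.1],
    "wor": [-0.2, 0.0, 0.3],
    "ld": [0.2, -0.1],
}


def synthesize(units):
    """Concatenate the recorded fragments for the given units, in order."""
    waveform = []
    for unit in units:
        if unit not in FRAGMENTS:
            # Nothing can be produced for a unit that was never recorded.
            raise KeyError(f"no recorded fragment for {unit!r}")
        waveform.extend(FRAGMENTS[unit])
    return waveform


if __name__ == "__main__":
    print(synthesize(["hel", "lo", "wor", "ld"]))
```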

      WaveNet, on the other hand, is designed to produce raw audio waveforms by learning from large volumes of speech samples. “During training, the network extracts the underlying structure of the speech, for example which tones follow one another and what shape a realistic speech waveform should have,” Aharon said.

      When given text input, a fully trained WaveNet model can generate the corresponding speech waveform much more accurately than other approaches to speech synthesis, he said. Current WaveNet models can generate up to 20 seconds of relatively high-quality audio in just 1 second.

      Pricing for the Cloud Text-to-Speech API is based on the number of text characters synthesized into audio. For speech synthesized without WaveNet, Google charges nothing for the first 4 million characters each month and $4 per 1 million characters after that. Enterprises that want WaveNet voices get the first 1 million characters free each month and then pay $16 for each additional million characters.
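
To make the pricing concrete, here is a small estimate of a monthly bill using only the figures quoted above. The function name `monthly_cost_usd` is hypothetical, and the sketch assumes charges are prorated per character beyond the free tier instead of rounded up to whole millions, which the article does not specify.

```python
def monthly_cost_usd(characters, wavenet=False):
    """Estimate one month's charge in US dollars from the quoted rates."""
    # Free allowance and per-million rate as quoted in the article.
    free_tier = 1_000_000 if wavenet else 4_000_000
    rate_per_million = 16.0 if wavenet else 4.0
    # Only characters beyond the free tier are billed (assumed prorated).
    billable = max(0, characters - free_tier)
    return billable / 1_000_000 * rate_per_million


if __name__ == "__main__":
    for count in (3_000_000, 10_000_000):
        print(count, monthly_cost_usd(count), monthly_cost_usd(count, wavenet=True))
```

Under these assumptions, 10 million characters in a month would come to $24 with standard voices (6 million billable at $4) and $144 with WaveNet voices (9 million billable at $16).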

      Vijayan is an award-winning independent journalist and tech content creation specialist covering data security and privacy, business intelligence, big data and data analytics.
