Updated Google Cloud Speech API Supports Longer Sound Bites | eWeek

Google Updates Cloud Speech API for Android App Developers

Aug 14, 2017

Google has added several new features to its Cloud Speech Application Programming Interface (API) for developers seeking to integrate speech recognition capabilities into their Android applications.

The updates add support for long-form audio clips and increase the number of languages for which speech recognition is available. The goal is to give developers more functionality and control when adding speech recognition to their products and services, Google product manager Dan Aharon said in an announcement on Google’s cloud platform blog.

Google’s Cloud Speech API is a machine-learning-powered technology for converting speech to text. Developers and website operators can use the API to enable capabilities such as voice transcription, audio file transcription, voice-enabled command and control, and call center routing in their applications and services.

Google has described the technology as being powered by machine intelligence. The API uses deep-learning neural network algorithms to improve its speech recognition capabilities with repeated use in the same setting. Developers can customize speech recognition to a particular setting or context by including specific phrases or words that might be spoken by users in that setting.

Google has described the API as being capable of streaming text results, so that the text appears even as someone is speaking. It can also be used to return text from saved audio files. Among the several other capabilities is one that lets developers filter out inappropriate content in the transcribed text.
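The phrase customization and content filtering described above correspond to settings in a recognition request. The sketch below only builds a request body and does not call the service. It assumes the JSON field names of the v1 REST interface (`speechContexts`, `profanityFilter`, `languageCode`); the bucket URI, the phrases and the helper name `build_recognize_request` are hypothetical and are not taken from Google's announcement.

```python
import json


def build_recognize_request(audio_uri, phrases, language_code="en-US",
                            filter_profanity=True):
    """Return a JSON-serializable body for a speech recognition request.

    Field names are assumed to follow the v1 REST interface. Depending on
    the audio format, an encoding and sample rate may also be needed.
    """
    return {
        "config": {
            "languageCode": language_code,
            # Phrase hints: words or phrases likely to be spoken in this setting.
            "speechContexts": [{"phrases": list(phrases)}],
            # Ask the service to mask inappropriate words in the transcript.
            "profanityFilter": filter_profanity,
        },
        # Saved audio is referenced by URI; this sketch does not cover streaming.
        "audio": {"uri": audio_uri},
    }


if __name__ == "__main__":
    body = build_recognize_request(
        "gs://example-bucket/support-call.flac",  # hypothetical location
        ["reset my password", "billing department"],  # hypothetical hints
    )
    print(json.dumps(body, indent=2))
```

Keeping the request construction separate from the network call makes it possible to inspect or log the exact settings before any audio is sent.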

Google currently offers its Cloud Speech API free for the first 60 minutes of audio processed. After that, the company charges $0.006 for every 15 seconds of processing.
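To make the pricing concrete, here is a rough estimate in Python using only the figures quoted above. It assumes that partial 15-second blocks are rounded up and that the free 60 minutes are simply subtracted from the total; neither detail is stated in the article, and the function name `estimate_cost` is made up for illustration.

```python
import math

FREE_SECONDS = 60 * 60       # first 60 minutes are free
PRICE_PER_BLOCK = 0.006      # US dollars per 15 seconds of processing
BLOCK_SECONDS = 15


def estimate_cost(total_seconds):
    """Estimate the charge in dollars for total_seconds of processed audio."""
    billable = max(0, total_seconds - FREE_SECONDS)
    # Assumption: a partial 15-second block is billed as a full block.
    blocks = math.ceil(billable / BLOCK_SECONDS)
    return round(blocks * PRICE_PER_BLOCK, 3)


if __name__ == "__main__":
    # One 180-minute file: 120 billable minutes = 480 blocks.
    print(estimate_cost(180 * 60))
```

Under these assumptions, a single 180-minute file would come to 480 billable blocks, or about $2.88.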

This week’s updates extend the long-form audio support capabilities of the Cloud Speech API. The maximum length of supported audio files has been increased from 80 minutes to 180 minutes. The API can also support files longer than three hours, but only on a case-by-case basis, Aharon said.

Google has also introduced a word-level timestamp feature, which Aharon said was one of the features that developers had wanted the most in the Cloud Speech API.

The timestamps let users jump to the moment in the audio where a particular piece of text was spoken, or display the relevant text while the audio clip is playing, Aharon said. The feature can help organizations significantly cut the time needed to proofread transcripts and improve the accuracy of speech-to-text transcription.
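As an illustration of the jump-to-text use case, the following sketch looks up where a given word was spoken. It assumes a response in which each recognized word carries `word`, `startTime` and `endTime` entries, with times written as strings such as "1.300s". The sample data is invented, and the helper names `parse_offset` and `find_word_starts` are hypothetical.

```python
def parse_offset(value):
    """Convert a time offset string such as "1.300s" to seconds."""
    return float(value.rstrip("s"))


def find_word_starts(response, target):
    """Return the start times, in seconds, at which target was spoken."""
    starts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        # Only the first (assumed most likely) alternative is inspected.
        for info in alternatives[0].get("words", []):
            if info["word"].lower() == target.lower():
                starts.append(parse_offset(info["startTime"]))
    return starts


# Invented sample data in the assumed response shape.
SAMPLE_RESPONSE = {
    "results": [{
        "alternatives": [{
            "transcript": "cloud speech converts speech",
            "words": [
                {"word": "cloud", "startTime": "0s", "endTime": "0.500s"},
                {"word": "speech", "startTime": "0.500s", "endTime": "1.100s"},
                {"word": "converts", "startTime": "1.100s", "endTime": "2.100s"},
                {"word": "speech", "startTime": "2.100s", "endTime": "2.700s"},
            ],
        }],
    }],
}

if __name__ == "__main__":
    print(find_word_starts(SAMPLE_RESPONSE, "speech"))
```

A media player could use the returned offsets to seek directly to each occurrence, and the same per-word times could drive on-screen text highlighting during playback.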

With this week’s update, Google has also added support for 30 additional languages. The updated Cloud Speech API now supports 119 languages and their variants. Among the new languages that the API supports are Bengali, Latvian and Swahili. According to Aharon, the new language support covers an additional one billion speakers around the world.
