SAN BRUNO, Calif. — YouTube is opening its online video storage banks to millions of new viewers by supplying captioning and subtitles for the benefit of deaf and hearing-disabled users — and people who simply speak a different language from the one used in a particular video.
The Google-owned video site announced that, as of March 4, it has made automatic captioning and subtitles available in about 50 languages, with more to come over time, YouTube spokesman Chris Dale said.
Most, if not all, YouTube videos will have the application “soon,” Dale said.
“A core part of YouTube’s DNA is simply better access to content,” YouTube product team leader Hunter Walk said. “As we approach YouTube’s fifth birthday, we’ve been working since Day 1 to do this with all video.”
Videos are viewed 1 billion times per day, and an average of 20 hours of new videos are uploaded to YouTube every minute, Dale said. That’s a ton of content to caption, but YouTube and Google say the cloud service is up to the task.
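The upload figure Dale cites implies a striking daily total. A quick back-of-the-envelope check (the rate is from the article; the arithmetic is ours):

```python
# Back-of-the-envelope: total new video per day at the cited upload rate.
hours_per_minute = 20          # new video uploaded to YouTube each minute
minutes_per_day = 24 * 60      # 1,440 minutes in a day
hours_per_day = hours_per_minute * minutes_per_day
print(hours_per_day)           # 28,800 hours of new video every day
```

That is more than three years' worth of footage arriving every 24 hours, which is why the captioning runs as a cloud service rather than on users' machines.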
The company is deploying Google’s voice-recognition software across all YouTube channels, with the application running on Google’s servers in the cloud. Users will not need to download any software.
Following more than five years of development, Google introduced the closed-captioning technology in beta form in November 2009. At the same time, it unveiled a new method to synchronize the captions to the video.
“Our goal is for anybody to go to any video at any time and be able to understand what’s being said clearly and cleanly, using our speech-recognition feature,” YouTube product manager Ken Harrenstein, who is hearing-disabled, told a group of journalists through a sign-language interpreter.
The feature will be indicated by a black “CC” (closed caption) button at the bottom right of the video window. When a viewer clicks the button, it turns red and a window appears asking which language he or she wants for the captions.
Not nearly perfected, but working on it
There are still plenty of bugs to be worked out. Speech recognition technology has been in development for more than 30 years, so it’s been a long, slow process to get it right.
“It [the translation] is not going to be perfect,” Harrenstein said. “Sometimes it will be funny, but it will work most of the time.”
For example, Harrenstein showed a video of a handheld device developer telling a group of software developers that “included in every device is a SIM card.” The translation came out as: “Included in every device is a salmon.”
Video producers who upload their films on YouTube will be able to review the machine-generated captions before posting them live, so that no mistakes get through. The texts of the captions also will be searchable.
“This is the way to make sure a product’s name is spelled right or that no other errors get into the captions,” Harrenstein said.
Harrenstein, an MIT graduate, explained how he often skipped lectures in college because he couldn’t understand them (“I just saw mouths moving; I couldn’t learn anything”). Now that many lectures are videotaped, uploaded to YouTube, and captioned, hearing-disabled students will be able to find a video, download the text of its captions, and search that text for the key words and phrases they need for their lesson.
“That way, people don’t have to listen through hours of video just to find a 10-minute section that they really want to hear,” John Foliot of Stanford University’s online accessibility program said.
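The workflow Harrenstein and Foliot describe — downloading a caption file and searching it for a keyword to jump to the right section — can be sketched in a few lines. This is an illustrative example only, not YouTube’s implementation; the SRT-style transcript and the `find_keyword` helper are hypothetical:

```python
# Hypothetical sketch: locate where a keyword occurs in a downloaded
# SRT-style caption file, so a viewer can jump straight to that section.
import re

def find_keyword(srt_text, keyword):
    """Return (timestamp, caption line) pairs where the keyword appears."""
    hits = []
    timestamp = None
    for line in srt_text.splitlines():
        m = re.match(r"(\d\d:\d\d:\d\d),\d+ --> ", line)
        if m:
            timestamp = m.group(1)  # remember the cue's start time
        elif keyword.lower() in line.lower():
            hits.append((timestamp, line.strip()))
    return hits

captions = """\
1
00:09:45,000 --> 00:09:52,000
Included in every device is a SIM card.
"""
print(find_keyword(captions, "sim card"))
# → [('00:09:45', 'Included in every device is a SIM card.')]
```

A case-insensitive match is used so a search for “sim card” still finds the caption even when the transcript capitalizes it differently.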