After years of taking a back seat to easier-to-crawl HTML pages, multimedia files are beginning to gain respect among search engines.
The spotlight this week turned to video, whether streaming or downloadable, as Yahoo Inc. late Wednesday posted an early test version of video search to its Yahoo Next site for public prototypes. Meanwhile, a much-smaller rival, Blinkx Inc., on Thursday unveiled a service that can transcribe video to make its contents searchable.
Both efforts demonstrate the growing importance of video and multimedia content on the Web as broadband has become more commonplace. Yahoo's new service in particular is likely to propel its main search competitors, Google Inc. and Microsoft Corp.'s MSN division, to more aggressively tackle multimedia search, said Gary Stein, a senior analyst at Jupiter Research.
“An increasing amount of the stuff on the Web is video, and if these [companies] are looking to index the Web, then why should they just be settling for text?” Stein said.
About half of the U.S. online population, or 64.1 million Web users, connects to the Internet using broadband, according to Nielsen/NetRatings. Broadband growth also has led advertisers and media companies to increase their use of online video to reach consumers, Stein said.
America Online Inc. entered multimedia search late last year with its purchase of Singingfish Inc., one of the earliest startups focused on the segment. Under AOL's ownership, Singingfish earlier this month began retooling its site as a search destination for audio and video clips.
Crawling multimedia content on the Web to make it searchable has posed tougher challenges for search engines than typical Web pages, said Bradley Horowitz, Yahoo's director of media search. Compared with text-heavy Web pages, video files provide little context about their contents.
“Web pages are self-describing,” Horowitz said. “With video, where you bump into a video link it's opaque, and you don't know what's inside the video.”
To discern context, Yahoo so far is analyzing the Web page text around a video link and the metadata included in a video file, such as its title and file type. Yahoo is not indexing the full contents of video with transcriptions, but Horowitz said the company is considering such an approach.
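The context-gathering step described above can be illustrated with a minimal Python sketch. It is not Yahoo's implementation: the class name `VideoLinkParser`, the function `describe_videos` and the list of file extensions are all assumptions made for illustration. It collects, for each link that points to a video file, the link's anchor text, the page title and the file type.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed set of video file extensions, chosen only for this example.
VIDEO_EXTENSIONS = {".avi", ".mpg", ".mpeg", ".wmv", ".mov", ".rm"}


class VideoLinkParser(HTMLParser):
    """Collects video links plus the text that describes them."""

    def __init__(self):
        super().__init__()
        self.page_title = ""
        self.videos = []        # one dict per video link found
        self._in_title = False
        self._current = None    # the video link currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            ext = os.path.splitext(urlparse(href).path)[1].lower()
            if ext in VIDEO_EXTENSIONS:
                self._current = {"url": href, "file_type": ext, "anchor_text": ""}

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._current is not None:
            self._current["anchor_text"] = self._current["anchor_text"].strip()
            self.videos.append(self._current)
            self._current = None

    def handle_data(self, data):
        if self._in_title:
            self.page_title += data
        elif self._current is not None:
            self._current["anchor_text"] += data


def describe_videos(html):
    """Return one record per video link, tagged with the page title."""
    parser = VideoLinkParser()
    parser.feed(html)
    parser.close()
    title = parser.page_title.strip()
    return [dict(video, page_title=title) for video in parser.videos]
```

A production crawler would likely also weigh nearby paragraph text and metadata embedded in the file itself; this sketch stops at the anchor text and page title to stay short.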
“We will be aggressive and use all means at our disposal to move video from opaque buckets of bits to make it something usable and that connects users to the content of video,” he said.
One part of that approach is the use of RSS (Really Simple Syndication), an XML syndication format. Yahoo supports RSS 2.0 for letting Web publishers submit their video to Yahoo's engine.
Yahoo has expanded on the idea of RSS enclosures, which let publishers include links to multimedia content and are commonly used for so-called podcasting, or Internet audio downloads.
Rather than rely on enclosures alone, Yahoo announced Media RSS, an extension to RSS 2.0 that lets publishers include links to streaming video and video files within a feed along with more descriptive information and even full transcripts, Horowitz said.
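As a rough illustration of what such a feed could carry, the Python sketch below parses a small, hypothetical Media RSS document with the standard library. The sample feed, its example.com URLs and the function name `parse_media_rss` are invented for this example. The namespace and the `media:content`, `media:description` and `media:text` elements follow the Media RSS specification as it was later published, which may differ from the version Yahoo first announced.

```python
import xml.etree.ElementTree as ET

# Namespace used by the published Media RSS specification.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# Hypothetical feed, written only for this example.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example Video Feed</title>
    <link>http://www.example.com/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>Rocket launch</title>
      <link>http://www.example.com/launch.html</link>
      <media:content url="http://www.example.com/launch.mov" type="video/quicktime"/>
      <media:description>Footage of a rocket launch</media:description>
      <media:text>Three, two, one, liftoff.</media:text>
    </item>
  </channel>
</rss>"""


def parse_media_rss(xml_text):
    """Return one record per feed item that carries a media:content link."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        content = item.find("media:content", MEDIA_NS)
        if content is None:
            continue  # plain RSS item with no media attached
        entries.append({
            "title": item.findtext("title", default=""),
            "url": content.get("url"),
            "type": content.get("type"),
            "description": item.findtext(
                "media:description", default="", namespaces=MEDIA_NS),
            "transcript": item.findtext(
                "media:text", default="", namespaces=MEDIA_NS),
        })
    return entries
```

Calling `parse_media_rss(SAMPLE_FEED)` yields a single record whose description and transcript fields give a search engine text to index that the video file alone would not provide.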
Along with using Media RSS to find new sources of video, Yahoo also plans in the future to crawl for Media RSS to include the feeds in its index.
Multimedia Search
Sunnyvale, Calif.-based Yahoo gained multimedia search expertise as part of its 2003 acquisition of Overture Services. Overture brought with it the AltaVista search engine, one of the first to incorporate video and audio search into its engine. The Yahoo Video Search project drew from AltaVista's experience but was its own project, Horowitz said.
“What you are seeing is fundamentally different from the [AltaVista] video search as it existed six weeks ago,” Horowitz said.
Yahoo's video search engine supports such common media file types as AVI, MPEG, Windows Media, QuickTime and Real. Some Macromedia Flash content is included, but Yahoo is still working to fully support Flash, Horowitz said.
As Yahoo tests the video landscape, startup Blinkx is tackling full indexing of video. The San Francisco company launched Blinkx TV, a beta service that captures video streams from 22 channels, including the BBC, Fox News, ESPN and Biography, and uses speech recognition technology to make their content searchable. It also includes audio streams from National Public Radio.
By indexing more than the anchor text and metadata associated with video, Blinkx can take users directly to a video clip and to the portion of the clip that matches their search terms, Blinkx founder Suranga Chandratillake said.
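The idea of jumping to the matching portion of a clip can be sketched in Python as an inverted index over timestamped transcript segments. This is an illustration only, not Blinkx's technology: it assumes speech recognition has already produced text segments with start times, and the function names and sample data are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments):
    """Map each word to the (clip_id, start_seconds) segments that contain it.

    segments: iterable of (clip_id, start_seconds, transcript_text) tuples,
    assumed to come from a speech recognition step not shown here.
    """
    index = defaultdict(set)
    for clip_id, start, text in segments:
        for word in tokenize(text):
            index[word].add((clip_id, start))
    return index


def search(index, query):
    """Return the segments whose transcript contains every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical transcript segments for illustration.
SEGMENTS = [
    ("news-1", 0, "Good evening and welcome"),
    ("news-1", 95, "Broadband adoption keeps growing"),
    ("sports-7", 40, "The match ended in a draw"),
]
```

With this data, `search(build_index(SEGMENTS), "broadband growing")` returns the one segment that starts 95 seconds into clip `news-1`, which is the offset a player would seek to.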
Strict video search is more comparable to how people conduct Web and image searches today, but Blinkx envisions its approach as more akin to TV search, Chandratillake said.
“The reality of [the video search] approach is it's pretty weak and gets to the Web site but not the best point on the Web site,” Chandratillake said. “With Blinkx, because we're indexing the content and what people are saying on television, then you jump to the BBC or CNN clip.”
Blinkx TV is available through the Web as well as part of Blinkx 2.0, the company's desktop download that provides a client for entering searches and adds search toolbars to Windows applications.
Blinkx 2.0's “smart folders” feature for automatically populating a Windows folder with search results now supports video. Users choose to receive either links to relevant video streams or the file downloads within a smart folder, Chandratillake said.
Full indexing of video likely will become more important for all search engines, Stein said. But it won't overshadow the bigger need to make multimedia results match the intent of searchers.
“Video is not going to escape the core challenge of search, which is how relevant are the results and how deep are the results,” he said.
Beyond technology, search companies appear likely to partner more directly with the creators of video to make it more searchable. For Yahoo, RSS is only part of the strategy. Horowitz said the company has worked closely with media companies and publishers on the video search effort.
Blinkx also has focused on providing access to video where it has a relationship with broadcasters. For example, with CNN video content, users can find relevant clips but still must subscribe to CNN's paid service to view full-length video.