NEW YORK—One company has an idea for how search engines can catalog the Web more completely. Another believes it can better divine what a searcher wants. Yet another is trying to sync all that with how the human brain works.
Startups and leading tech companies, including search exemplar Google, are tinkering with new ways of culling and presenting information, ways that could prompt the next revolution in search.
"Because information is exploding, [the Internet] is going to become increasingly difficult to use if we don't get it right," said Liesl Capper, chief executive of Australian search startup Mooter.
Current technology troubles users like private investigator Cynthia Hetherington. When she recently suspected an Australian company of possible fraud, Hetherington turned first to Google. But then she went to the Australian Securities and Investments Commission, LexisNexis and Dun & Bradstreet.
Users who consider Google exhaustive are only fooling themselves, experts say. Today's search engines may be capturing as little as 1 percent of the Web, largely because of how they find and index online resources.
"It's very frustrating," said Hetherington, who runs a Haskell, N.J., company. "It's like going to a library and only pulling one book off the shelf."
Search analyst Danny Sullivan sees promise in developments to address such flaws, and he believes tomorrow's search engines are likely to blend the best of them.
But he also cautioned that the Internet is littered with search innovations that failed to draw investors or market share.
Currently, all search engines fail to capture the bulk of the "invisible Web": resources locked up in databases and inaccessible to the engines' indexing crawlers. These include regulatory filings at the U.S. Securities and Exchange Commission, detailed reports on charities at GuideStar and complete archives of most newspapers.
Sometimes, accessing an "invisible" database requires payment. Search engines can't let you know about a document's availability for purchase if they can't scan it in the first place.
But even when a database is free, a site may require registration, prohibit search crawlers or use incompatible formats.
In particular, crawlers are stymied by dynamic Web pages, which are customized as users choose various options, such as car color at Cars.com.
To counter that, Chicago-based Dipsie Inc. is developing software that promises to fill out simple, multiple-choice online forms such as those at Cars.com, though not complex ones that require typing in keywords, such as those for the government's patent and trademark databases. A public test version is expected by summer.
Other companies are working to capture sound and video files that have troubled text-based crawlers.
StreamSage Inc. uses speech-recognition technology to transcribe feeds, so a search engine can pull out relevant portions of a long presentation. Company president Seth Murray said Harvard's medical school and NASA already use the technology, but engineers still must speed it up for broader use.
Yahoo Inc. is going a less technical, more controversial route: Businesses can pay to ensure that their "invisible Web" pages get indexed.
But indexing more of the Web only brings up another challenge—identifying the most relevant among the billions of documents available. So some search developers are focused on personalizing and organizing searches.
Eurekster Inc., a startup launched in January, is marrying search with social networking, in which friends, your friends' friends and their friends form online circles. Eurekster guesses what you're seeking based on what others in your circle have found relevant.
"At the moment, when you search on Google, everyone gets the same results for the same keywords," said Shaun Ryan, vice president of business development for Eurekster in New Zealand. "We try to personalize those results."
So, a search for "casting" might produce sites on movies if your circle is heavily into entertainment, or on fly fishing if members enjoy weekend outings.