Powerset Holds Promise for Semantic Search

 
 
By Clint Boulton  |  Posted 2008-05-13
 
 
 

Semantic search startup Powerset launched its first product May 12, a search engine focused on finding articles on Wikipedia.

The technology has wowed analysts, some of whom proclaimed it a potential Google killer in the hands of Microsoft, a company capable enough to wield it. Those comments came amid rumors that Powerset is being pursued by would-be acquirers.

Powerset executives wouldn't comment on the acquisition talk, but downplayed the claims about being a Googleslayer. After a test tour of the software, which uses natural language processing to return not just exact words used in queries, but synonyms for those words, eWEEK agreed.

Right now, it is better suited for helping students research term papers because it provides more rounded results and aggregates them on a page for the user.

While Google, Yahoo and Microsoft leverage keywords to help users find what they are looking for, Powerset lets users enter keywords, phrases or questions. Imagine Ask Jeeves if Ask Jeeves had accomplished what it set out to do: answer questions by better understanding what the questioner meant.

The responses don't just yield a group of blue links, but provide search results culled from across multiple Wikipedia articles.

Scott Prevost, Powerset general manager and director of product, told eWEEK that Powerset's natural language search approach parses each sentence, extracts meaning from it and indexes it. When users type queries, the search engine matches meaning to meaning.

"Wikipedia search is pretty good at finding topics, but if you want to find things deep in the pages, it's not too great," Prevost said. "You can do that in Google, but then you're jumping back and forth between the sites, and Google doesn't do aggregated things like this."

For example, a search for Henry VIII (I tested it in Opera 9.5) pulls up a dossier of information drawn from Wikipedia and Freebase, an open database of the world's information. The bottom half of the page contains "Factz," or snippets that are automatically extracted from Powerset's indexing of the article text.

A click on "status" returns sentences about status that Henry VIII granted during his reign. The cool thing is that these aren't from the Henry VIII page on Wikipedia, but culled from other areas on Wikipedia. It's kind of like using the index of an encyclopedia to find related nuggets of information without the aggravation of flipping through the book.

For a more complicated search, I queried "When did earthquakes hit Tokyo?" This returned highlighted snippets of data that semantically matched the query. Moreover, the third result highlighted a reference to an earthquake that "struck" Tokyo, picking up that verb as a synonym for "hit."
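To make the "struck" versus "hit" match concrete, here is a minimal sketch of synonym-aware matching. It is not Powerset's implementation, which the company describes as full natural language parsing; the synonym table, the function names and the sample sentence are all hypothetical and written only for this illustration.

```python
# Toy synonym table: hypothetical and hand-written for this illustration.
# A real semantic engine would derive these relationships linguistically.
SYNONYMS = {
    "hit": {"hit", "hits", "struck", "strike"},
    "earthquake": {"earthquake", "earthquakes", "quake"},
}

def normalize(word):
    """Lowercase a word, trim punctuation, and map it to a shared concept."""
    w = word.lower().strip('.,?!"')
    for concept, forms in SYNONYMS.items():
        if w in forms:
            return concept
    return w

def concepts(text):
    """Return the set of normalized concepts found in a piece of text."""
    return {normalize(w) for w in text.split()}

def overlap(query, sentence):
    """Concepts shared by the query and a candidate sentence."""
    return concepts(query) & concepts(sentence)

query = "When did earthquakes hit Tokyo?"
sentence = "A major earthquake struck Tokyo in 1923."  # sample text, not an actual result
shared = overlap(query, sentence)
```

Because "struck" and "hit" map to the same concept in the toy table, the sentence matches the query on "earthquake," "hit" and "tokyo" even though it never contains the word "hit." A plain keyword match would only see "Tokyo" in common.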

In addition, clicking an arrow tab next to each result presented a mini-viewer, which lets users see the snippet in the context of the whole Wikipedia article.

There are also a couple of interesting navigation tools, called "clear highlighting" and "explore Factz," which present users with a tag cloud of facts, providing an almost Cliff's Notes-like feature for each article.

Powerset is limited to Wikipedia and Freebase, though it has designs on adding content from multiple Web sources. Until that time, and that could take years, this cannot be considered a Google killer.

The San Francisco company plans to tie search to online advertising to make money, but officials are mum on the details, save to say it won't be long.

The equation for Powerset will be better search + scale + ad revenue = success.

In light of the billions of pages Google has indexed in the last 10 years, that seems like a staggering feat, if it can be achieved. One wonders whether the company will be left alone to pursue its goals of adding content and building a multibillion-document index.
