IBM Takes Search to New Heights

By Darryl K. Taft  |  Posted 2003-08-11

IBM is working on new search technology that may eventually give Google Inc. a run for its money, at least in the corporate space.

Officials at IBM's T.J. Watson Research Center discussed with eWEEK this month how the company is tackling the problem of understanding unstructured data. Using a combination of artificial intelligence techniques, IBM's UIMA (Unstructured Information Management Architecture) is the foundation for what Paul Horn, IBM senior vice president and director of research, calls "Google on steroids."

UIMA uses what officials call a "combination hypothesis" to extract knowledge and understanding from large volumes of unstructured data. The approach tackles a search problem by applying several technologies in unison, in a kind of brute-force attack.

Salim Roukos, manager of multilingual natural language processing research at IBM Research, in Hawthorne, N.Y., and an expert in machine translation, said, "Through UIMA, the components [to approach the problem] were more readily available. And what we're working on now is extending UIMA to support an effective way of combining these components."

IBM has developed three systems based on UIMA. The first, internally called Jedi, is a pure Java version of the framework; another is a C++ version. A third, which is the most likely to go into broader use and into products in some form, is called Web Fountain and uses a Web services approach.

Horn said Web Fountain "goes out on the Net, crawls around and reads text. It reads the text, understands the text and will tell you what's in the text."

Web Fountain also features natural language functionality, which allows it to find correlated subjects. "We've done this for a number of big companies; one is British Petroleum [plc.]," Horn said. "We went out and found out what people were saying about them."

UIMA is an example of how IBM's research arm works with its product and services divisions to turn out new offerings in each, officials said.

"We're using this technology to differentiate our consulting practices," said Horn. "But the core search technology is going into our software products, such as the portal," and other technologies.

Darryl K. Taft covers the development tools and developer-related issues beat from his office in Baltimore. He has more than 10 years of experience in the business and is always looking for the next scoop. Taft is a member of the Association for Computing Machinery (ACM) and was named 'one of the most active middleware reporters in the world' by The Middleware Co. He also has his own card in the 'Who's Who in Enterprise Java' deck.
