IBM's research arms have also been hard at work laying the groundwork for the upcoming revolution in search. Its research projects in this area have included Clio, in which IBM's Almaden Research Center has been working to develop a tool that will enable its Content Manager software to more easily index and search XML data. Some 18 months ago, IBM disclosed that Clio would result in a tool called Cinnamon. Cinnamon's raison d'être has to do with current problems in placing queries to XML-tagged data. Querying has required proprietary code that either doesn't take full advantage of the XML format or cannot be used consistently, IBM executives said at the time.
In IBM's T.J. Watson Research Center, research has long been under way on Web Fountain, the Web services version of IBM's UIMA (Unstructured Information Management Architecture), a technology based on artificial intelligence techniques that goes out on the Internet, crawling around and reading text and then interpreting it.
Indeed, anybody who's listened to the sales spiels of the two database giants knows that they're going after the same targets: customers who need to get a handle on data that doesn't neatly sit in the rows and columns of structured databases. Some studies estimate that about 81 percent of what enterprise users do as part of core business processes still requires them to deal with physical or digital documents, documents that have been hard, if not impossible, to access, search through or store in traditional relational databases. Think of the paperwork that an insurance or financial services company still handles: claims processing, loan origination or new account on-boarding.
Andy Warzecha, an analyst at The META Group, based in Stamford, Conn., said there are two trends now driving enterprises to get a grip on that data. First, there's just far more stuff to deal with than ever, and its production isn't slowing down.
"While we have organizations struggling to decrease the processing times they go through in core business processes, the counterpoint is theres more stuff they have to look at in their core business processes," Warzecha said. But while its nice to work efficiently with less paperwork, the real driver behind getting a handle on search is the sharp increase in regulatory demands over the past 18 months, Warzecha said. Between Sarbanes-Oxley, HIPAA (Health Insurance Portability and Accountability Act), Basel II, OSHA (Occupational Safety and Health Administration) and other post-Enron inspired regulations, all of that paper, all of those images and all of that unstructured data now has risk and penalty tied to it. "The first stuff [typically] subpoenaed is from the e-mail environment," he said. "Theres stuff there that shouldnt be. Organizations that are supposed to be destroying things arent doing a good job at that." Next Page: Niche players versus the giants.
IBM is not the only one who's hot on the search trail, either. Oracle unveiled the results of years of research into these same technologies at its OpenWorld conference earlier this month, including its Oracle Files 10g enterprise content management technology.