During the 1992 presidential campaign, the headquarters of then-candidate Bill Clinton sported signs that said, “It’s the economy, stupid.” In the same way, I wonder if IT strategy rooms ought to put up banners that read, “It’s the data, stupid.”
In the last few weeks, I’ve encountered many reminders that databases are still an area of innovation, as well as being the strategic purpose, and a controversial aspect, of many large-scale information systems.
Database technology was the focal point of a recent special report in the science journal Nature. The report highlighted features of Oracle’s Oracle Database 10g that were aimed at life sciences needs. Regardless of whether you’re into genetic analysis, the functions mentioned in the report should pique your interest. I can think of many enterprise applications for pattern-recognition capabilities, embedded machine-learning algorithms, regular-expression searching, support vector machines and advanced tools for mining unstructured text.
All are among the features that make Database 10g much more than a large-scale data repository. Old 1960s labels such as “electronic brain” come to mind: Database 10g doesn’t just know stuff, it also thinks about it.
Speaking of old ideas, I remember a conversation I had about 15 years ago with Wayne Erickson of Microrim, an early pioneer in putting relational database power on desktops with R:base. Erickson talked about the challenge of searching for data that can’t be described with numbers or text.
That challenge is being addressed today by a research project at MIT called iDeixis, which matches cell phone camera images against the attributes of other images found by Web-crawling servers.
In effect, the camera link asks the iDeixis server, “What have you seen that looks like this?” The “looks like” algorithm considers shapes, colors and other details.
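To make the “looks like” idea concrete, here is a minimal Python sketch of one simple way to rank stored images by color alone. It is not a description of how iDeixis works; the technique shown is plain color-histogram intersection, and every name in it (color_histogram, similarity, best_match and the two-item parts library) is invented for illustration. Images are assumed to be non-empty lists of RGB tuples.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Quantize each RGB channel into `levels` buckets and return
    the share of pixels that falls into each (r, g, b) bucket."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}


def similarity(hist_a, hist_b):
    """Histogram intersection: 1.0 when the color distributions are
    identical, 0.0 when they share no color bucket at all."""
    return sum(min(share, hist_b.get(bucket, 0.0))
               for bucket, share in hist_a.items())


def best_match(query_pixels, library):
    """Return the name of the library image whose colors are closest."""
    query = color_histogram(query_pixels)
    return max(library,
               key=lambda name: similarity(query, color_histogram(library[name])))


# Hypothetical library: two tiny "images" standing in for catalog photos.
library = {
    "brass_valve": [(200, 160, 60)] * 8 + [(90, 70, 30)] * 2,
    "pvc_elbow": [(240, 240, 240)] * 9 + [(180, 180, 180)] * 1,
}

# Hypothetical camera shot of a brass-colored part under different lighting.
photo = [(205, 165, 55)] * 7 + [(95, 75, 25)] * 3

print(best_match(photo, library))
```

A practical matcher would presumably also weigh shapes and other details, as the iDeixis description suggests, but the underlying move is the same: reduce each image to comparable attributes and rank the stored images by a similarity score.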
What makes iDeixis interesting is the context of wireless connectivity and cheap digital imaging hardware; what makes it potentially useful, and not just a nifty demo, is the Web’s vast database of imagery for comparison against what the camera sees. Consider the ease of applying this technology to enterprise applications with relatively small and well-defined image libraries.
For example, I remember once saying how much I’d like to walk into a hardware store with a broken plumbing part, hold the part up to a camera at a database kiosk and see a floor plan of the store that would tell me where to get the replacement part. That’s coming soon, I suspect.
We still have a long way to go in making databases even more useful. Erickson noted during our conversation, for example, the difference between appearance and meaning. There’s a substantial difference between the requests “Show me that graph I saw last month of projected business growth rates” and “Show me that graph with the red and blue lines on it.” The latter is a much easier goal for simple algorithms, but the former would be much more valuable in business or other analytic settings.
Data annotation features can help us bridge that gap. Not only life scientists but also digital photographers and music collectors are putting pressure on database vendors to make such facilities richer and easier to use.
Life sciences needs are also driving database integration efforts like those in use at the European Bioinformatics Institute, where a single protein-structure query interrogates a diverse collection of independently developed data resources. These techniques could pave the way to databases that figure out what we want to know instead of merely answering the question that we ask.
Not all such database improvements will be universally admired. Controversy over the use of national identification cards, for example, is an argument more about integrating data than it is about the cards themselves. We don’t object to showing a photo ID to get on an airplane, but some of us would rather not have that single ID become a gateway to everything from the medications we take to the books we buy.
When databases can answer more questions, they will also need better tools for deciding what questions someone will be allowed to ask.
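One way to picture such a tool is a gate that checks each question against a policy before any data is touched. The Python sketch below is an illustration under assumed names (Question, POLICY, allowed, ask) and an invented policy; it does not describe any product’s access-control feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """A request to read certain fields, made for a stated purpose."""
    purpose: str
    fields: frozenset


# Hypothetical policy: the fields each purpose is permitted to see.
POLICY = {
    "airline_checkin": frozenset({"name", "photo"}),
    "pharmacy": frozenset({"name", "medications"}),
}


def allowed(question):
    """A question is allowed only if its purpose is known and every
    requested field is within what that purpose may see."""
    permitted = POLICY.get(question.purpose)
    return permitted is not None and question.fields <= permitted


def ask(question, record):
    """Answer the question from the record, or refuse before reading it."""
    if not allowed(question):
        raise PermissionError(
            f"{question.purpose!r} may not ask for {sorted(question.fields)}")
    return {field: record[field] for field in question.fields}


# Hypothetical integrated record behind a single ID.
record = {
    "name": "A. Traveler",
    "photo": "id.jpg",
    "medications": ["aspirin"],
    "books": ["SQL primer"],
}

# The check-in desk can confirm identity, and nothing more.
print(ask(Question("airline_checkin", frozenset({"name", "photo"})), record))
```

The design choice worth noticing is that the refusal happens at the level of the question, not the table: the data can stay integrated while the set of askable questions stays narrow.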
Technology Editor Peter Coffee can be reached at firstname.lastname@example.org.