IBM has loosed its “Viper” database upon the world, announcing on Wednesday an open beta for the next-generation, native XML/relational data version of its DB2 database.
Viper, the code name for DB2’s next version, which does not as yet have a final name, has been in alpha testing since June.
The company also announced that it will extend early support of Viper to the PHP development community using Zend Core for IBM, an offering that integrates IBM’s Cloudscape database and Zend’s PHP environment, each of which is based on open-source technology.
Viper contains native XML technology that does away with the two traditional ways of handling XML data: shredding (parsing the document and putting the data assigned to a particular tag into a column in a relational table) or putting “blobs” of data into relational fields.
Both nonnative ways of handling XML are deficient as they stand on their own. Shredding XML means you lose the fidelity, or the hierarchy, of the XML document itself.
For example, if XML data comes from a Web application that includes an electronic signature that’s associated with part of a form, that association is contained in the XML hierarchy.
But if you parse the XML content into rows across a relational table, that hierarchy is lost, and you’re unable to pull that exact structure back out.
Blobs retain XML fidelity, but you lose the ability to search on the data that’s put into those fields.
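The trade-off between the two nonnative approaches can be sketched in a few lines of Python. This is only an illustration using the standard library’s ElementTree module, not a depiction of how DB2 or any shredding tool works; the claim form, its tag names and the `shred` and `store_as_blob` helpers are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical form: two signatures, each tied to a different part of the form.
FORM = """<claim id="C-1001">
  <applicant>
    <name>Ann Lee</name>
    <signature>sig-ann</signature>
  </applicant>
  <witness>
    <name>Bo Chen</name>
    <signature>sig-bo</signature>
  </witness>
</claim>"""


def shred(xml_text):
    """Naive shredding: flatten the document into (tag, value) rows."""
    rows = []
    for elem in ET.fromstring(xml_text).iter():
        text = (elem.text or "").strip()
        if text:  # skip container elements that hold only whitespace
            rows.append((elem.tag, text))
    return rows


def store_as_blob(xml_text):
    """Blob storage: keep the document intact as opaque bytes."""
    return xml_text.encode("utf-8")


rows = shred(FORM)
# The rows are searchable, but nothing in them records which signature
# belonged to the applicant and which to the witness.
signatures = [value for tag, value in rows if tag == "signature"]

blob = store_as_blob(FORM)
# The blob keeps full fidelity, but it must be decoded and re-parsed
# before any question about its contents can be answered.
restored = ET.fromstring(blob.decode("utf-8"))

# A hierarchy-aware lookup can still ask for one specific signature.
applicant_signature = restored.findtext("applicant/signature")
```

The shredded rows answer “which signatures exist?” but not “whose signature is this?”, while the blob answers neither until it is parsed again.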
Bernie Spang, director of Database Servers for IBM, said that Viper gives the best of both worlds, as its XML handling capabilities allow users to retain the hierarchical, searchable form of XML.
Viper’s native XML technology provides support for XQuery, an emerging standard language that extends XPath and is specially designed for processing XML data.
With Viper, applications can use XQuery, standard SQL or both to retrieve documents from either or both underlying storage formats.
“You can mix XQuery within a SQL query and vice versa,” Spang said. “You can write one query against the database and it can give an answer, some of which is in XML structure and some of which is in table structure.”
An insurance company, for example, could run a single query on account balances that draws on data held both in relational tables and in XML insurance claim forms.
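The idea of one query spanning tabular and XML data can be approximated with a small Python sketch. It does not use DB2, XQuery or any IBM API: SQLite stands in for the relational side, ElementTree’s path lookup stands in for the XML side, and the schema, sample data and `auto_claims_with_balance` function are assumptions invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Stand-in storage: account balances in a table, claim forms kept as XML text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id TEXT PRIMARY KEY, balance REAL)")
conn.execute("CREATE TABLE claims (account_id TEXT, claim_doc TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("A1", 1200.0), ("A2", 50.0)],
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?)",
    [
        ("A1", "<claim><type>auto</type><amount>300</amount></claim>"),
        ("A1", "<claim><type>home</type><amount>900</amount></claim>"),
        ("A2", "<claim><type>auto</type><amount>75</amount></claim>"),
    ],
)


def auto_claims_with_balance(conn, min_balance):
    """Filter on a relational column, then on a value inside the XML form."""
    sql = (
        "SELECT a.account_id, a.balance, c.claim_doc "
        "FROM accounts a JOIN claims c ON c.account_id = a.account_id "
        "WHERE a.balance >= ? ORDER BY a.account_id"
    )
    results = []
    for account_id, balance, doc in conn.execute(sql, (min_balance,)):
        claim = ET.fromstring(doc)
        if claim.findtext("type") == "auto":  # condition on the XML side
            results.append((account_id, balance, float(claim.findtext("amount"))))
    return results


result = auto_claims_with_balance(conn, 100.0)
```

Here the application has to stitch the two halves together by hand. The point Spang makes is that Viper is meant to let a single query express both conditions inside the database.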
Spang pointed to data from client surveys and a compilation of other sources showing that less than 20 percent of critical information is stored in relational databases, compared with some 35 percent stored in XML format.
XML’s Rapid Growth
Data in either stored form continues to grow at a rapid rate, but XML is growing more rapidly as the industry shifts to XML-based formats, in OpenDoc and Microsoft Corp.’s Office, for example, he said.
“You see industries across the board, including government, adopting XML standard formats for documents and information exchange,” Spang said, such as the use of electronic tax filing in government or the use of XML-based insurance forms.
IBM also unveiled new features of the database, claiming Viper will be the first database to support all three common methods of database partitioning (range partitioning, multidimensional clustering and hashing) at the same time, as a means to improve data management and information availability.
This will give customers the ability to structure data for optimal querying, Spang said.
For example, users will be able to store data by ranges, such as all information for a particular year.
Within that year, data can also be structured multidimensionally by quarter, say, or by regions, geography or country.
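How the three methods might apply to a single row can be sketched as follows. The article does not describe how Viper actually combines them, so the column names, the four-node default, the CRC32 hash and the `place_row` and `build_layout` helpers are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def place_row(row, num_nodes=4):
    """Return (node, range_partition, cell) for one row.

    node:            hash partitioning on a key column (assumed: customer_id)
    range_partition: range partitioning by year
    cell:            multidimensional clustering on (quarter, region)
    """
    node = zlib.crc32(row["customer_id"].encode("utf-8")) % num_nodes
    range_partition = row["year"]
    cell = (row["quarter"], row["region"])
    return node, range_partition, cell


def build_layout(rows, num_nodes=4):
    """Group rows by where all three methods together would place them."""
    layout = defaultdict(list)
    for row in rows:
        layout[place_row(row, num_nodes)].append(row)
    return layout


rows = [
    {"customer_id": "cust-17", "year": 2005, "quarter": "Q1", "region": "EMEA"},
    {"customer_id": "cust-17", "year": 2005, "quarter": "Q1", "region": "EMEA"},
    {"customer_id": "cust-42", "year": 2006, "quarter": "Q3", "region": "APAC"},
]
layout = build_layout(rows)
```

A query for first-quarter EMEA data in 2005 would then need to look only at the matching year range and cell rather than scanning everything.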
Philip Howard, an analyst at Bloor Research, put out a research note on Viper in which he agreed with the direction IBM is taking with Viper.
“IBM has concluded, rightly in my view, that using a relational approach is not adequate for processing XML,” Howard wrote.
“Either you store it in relational format, in which case you get a major performance hit because you have to convert it to and from tabular format whenever you store or retrieve it, or you have to store it as a binary large object, in which case you can’t do any processing with it.”
So, Howard said, having two storage engines, relational and native XML, is the next logical step, with all that entails: separate tablespaces, indexes and so on.
On the other hand, Howard said, database management components such as autonomics and the optimizer will all be held in common and sit above the two engines.
Having a database management layer on top and two database storage engines beneath it raises the question of whether users might have more than two storage engines. The answer, in principle, is yes, Howard said.
As far as marketing goes, Howard thinks it likely that the XML storage engine will be offered as an optional extra.
“…There is obviously the possibility that you might want to license the XML database without the relational engine,” he said.
“As and when IBM moves the DB2 content repository to the new platform [something which has not been announced but which is an obvious next move], this could be a possibility.”
As far as the competition goes, Howard said, IBM is leaving Oracle Corp. and Sybase Inc., the database vendors with the “best current handle on XML,” well behind the curve.
“I expect to see Oracle, in particular, to froth at the mouth at this announcement,” Howard said.
“It will no doubt declare that this is the wrong direction and the wrong road. In my opinion, it will be Oracle that is wrong: You just can’t get both the necessary flexibility and performance that you need for XML unless you are prepared to move away from a purely relational approach. So any frothing at the mouth will be exactly that: froth and bubble.”
IBM also released details of partners’ plans to use Viper.
Justsystems Inc., an enterprise software provider, is working with IBM to deliver a solution for native XML applications based on Viper.
Justsystems has a front-end application platform, called xfy, that will provide native support for XML content handling and business intelligence.
In addition, Exegenix, maker of XML content conversion technology, is putting out the Exegenix Document Migration Toolkit for DB2.
Customers, developers and partners can register for the Viper open beta program here.