XML Comes Home to the Database

Innovations 2006 Analysis: Used to be, friends didn't let friends use stored procedures and triggers. Wisdom was, databases were just those boring, clunky old things used to crunch numbers and process basic tex

Editors Note: This story is part of , a continuing series of stories from the reporters and editors of Ziff Davis Internet. Instead of the usual mile-high look at the year ahead, these articles examine particular technologies and markets in transition, including whats in store for them.

Used to be, friends didnt let friends use stored procedures and triggers. Application logic should happen in the application tier, the wisdom went, and databases were just those boring, clunky old things used to crunch numbers and process basic text.

Well, all thats changing, because this is the year when XML comes home.

Home, as in, to its rightful home—roosting right where database administrators want it: in the relational database.

"Database people have been waiting for this to happen for a decade or two," said Peter OKelly, an analyst with Burton Group. "Theyve been waiting for everybody to realize that everything should be stored in the DBMS [database management storage system]."

Big database vendors are just at the start of a new stage of innovation and competition with regards to wrangling both structured, i.e., relational, data, and unstructured data, as represented by XML, in the same box.

Microsoft Corp. arguably beat IBM out the door with support for XQuery and a native XML data type in SQL Server 2005, which became available in November, while Oracle Corp. boasts of being the first to support XQuery.

But IBM is distinct in packing a revolutionary concept with its dual-storage approach to handling XML, OKelly said. This approach is now at the open beta stage in IBMs next version of DB2, code-named Viper.

All the big relational database vendors are shying away from traditional means of handling XML in the relational database. Up until very recently, non-native ways of handling data have included shredding or parsing, wherein data is assigned to a particular tag in a column in a relational table, and putting "blobs" or "CLOBS" of data into relational fields.

But shredding XML means you lose the fidelity, or the hierarchy, of the XML document itself. Blobs retain XML fidelity, but they prevent you from searching on data thats put into fields.

/zimages/5/28571.gifClick here to read an eWEEK Labs review of Microsoft SQL Server 2005.

The most serious problem with these approaches is performance, according to Thore Thomassen, senior enterprise architect for Storebrand, a major supplier of insurance and financial services based in Oslo, Norway. Storebrand is an early user of Viper.

"Because most relational databases are normalized, the shredding gets very complex, and some of the most advanced SQL we have in Storebrand represents mappings between XML and a relational model," he wrote in an e-mail interview with Ziff Davis Internet News. "In some cases this mapping is not possible or the performance is too bad for it be used (we had one query that took minimum 10 minutes to complete, 30 max)."

In such cases, Storbrand stores XML as a CLOB. The problem there is you can neither access nor query the XML content.

The other big problem with shredding, Thomassen wrote, is that Storebrand loses most of the flexibility offered by XML.

"The result of trying to solve this problem is complex, flexible relational models, which are very hard to maintain," he wrote. "Examples of this are tables with 70 columns where only 10 are used or complex metadatabases where the data model is defined on the fly."

Bernie Spang, director of Database Servers for IBM, has said that Viper gives the best of both worlds, as its XML-handling capabilities allow users to retain the hierarchical, searchable form of XML. Vipers native XML technology provides support for XQuery, an emerging standard language that extends XPath and is specially designed for processing XML data.

With Viper, applications can use XQuery, standard SQL or both to retrieve documents from either or both underlying storage formats.

SQL Server 2005 keeps XML and relational data in one storage engine. Microsoft has set it up so you can insert SQL queries into XQueries and vice versa—a technique that one SQL Server 2005 user said gains more power for the developer.

"… It allows the developer to cross over the means of programming (XQuery and TSQL) and also helps the engine, using indexing and query optimization over the whole data," said Gerald Schinagl, diplomized engineer and project manager for the Sports Database at Austrian Broadcasting Corp. Radio & Television, in an e-mail exchange.

Next Page: Upgrading for XML support.