XML Database Doubts

By Timothy Dyck  |  Posted 2001-12-03

As XML becomes an increasingly important data interchange format, it makes sense to look at new ways of storing information directly in XML and using XML-based tools for data query and manipulation. However, what those tools will look like is still very much up in the air.

A native XML database market is emerging to tackle this need. XML databases such as Ixiasoft Inc.'s TextML Server, Software AG's Tamino and XYZFind Corp.'s XYZFind Server allow data to be submitted in XML format, provide XML-based query languages and return data in XML format. However, eWeek Labs' tests show that, just because program data is in XML format, a native XML database is not necessarily the right place to store it.

Generally speaking, XML databases just aren't technically strong enough to compete with relational databases: They lack numerous administrative, interoperability, programmability and manageability benefits provided by the big relational databases.

Lack of clear standards is also a problem in the XML database space. The XPath query syntax unfortunately has no support for grouping, sorting or summarizing data, and the much richer XQuery query language is still in draft form. And even when XQuery is formalized, it's likely that it won't support updates, inserts or deletions.
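To make the XPath limitation concrete, here is a minimal Python sketch. It assumes a hypothetical orders document (the element names, attribute names and the totals_by_region helper are illustrations, not taken from any product tested here) and uses the path-expression subset in Python's standard xml.etree.ElementTree module. The path expression only selects nodes; the grouping and sorting have to be written by hand in the host language.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data; the structure is an assumption for illustration.
ORDERS_XML = """
<orders>
  <order region="east"><total>120</total></order>
  <order region="west"><total>80</total></order>
  <order region="east"><total>30</total></order>
</orders>
""".strip()


def totals_by_region(xml_text):
    """Select orders with a path expression, then group and sort in Python."""
    root = ET.fromstring(xml_text)
    sums = defaultdict(int)
    # The path expression does the selection only.
    for order in root.findall("./order"):
        # Grouping by region happens here, outside the query language.
        sums[order.get("region")] += int(order.findtext("total"))
    # Sorting (largest total first) is also done by the application.
    return sorted(sums.items(), key=lambda pair: pair[1], reverse=True)


REGION_TOTALS = totals_by_region(ORDERS_XML)
print(REGION_TOTALS)
```

In a relational database, the same report would be a single statement with GROUP BY and ORDER BY clauses, which is the kind of capability the XQuery drafts aim to bring to XML queries.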

For early adopters of XML databases, this means increased costs until these issues are sorted out because the XML databases available now all use vendor-proprietary query languages and programming interfaces.

The Race Is On

It's safe to say that, within the next few years, all database products will need to be able to quickly verify, store and retrieve data in XML format. More open to question is whether traditional relational databases can gain XML features faster than new XML databases can gain the scalability, programmability, reliability and manageability of the relational players.

Based on history and our experience, traditional relational databases will beat XML databases to the punch.

In 1996 and 1997, we saw relational players Oracle Corp., IBM and Informix Software Inc. (now part of IBM) add object database and Java language features to their relational databases to compete with pure object databases. In 1998 and 1999, these same vendors added a variety of extensibility features to store geospatial, text, image, HTML and time series data in their databases, basically killing the market for custom databases that were designed just to store one of these types of data.

Now, the relational database players are taking advantage of the past work that added object support, extensibility, Java and text manipulation to their products and are combining these strengths with their extensive research efforts into XML parsers and query languages. Long term, we think relational engines will be the right place for both XML and non-XML data.

Oracle, IBM and Sybase Inc. have all added an XML data type to their databases to store XML in native format. These vendors' database products, in addition to Microsoft Corp.'s Microsoft SQL Server, allow database administrators to parse XML data at import time and store that information in a set of relational tables. They also allow retrieval of data in XML format.
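The parse-at-import approach can be sketched in a few lines of Python. This is not how any of the products named above implement it; it is a minimal illustration that uses the standard sqlite3 and xml.etree.ElementTree modules, and the catalog document, the book table and both function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document; the structure is an assumption for illustration.
CATALOG_XML = """<catalog>
  <book id="b1"><title>XML Basics</title><price>29.95</price></book>
  <book id="b2"><title>SQL Tuning</title><price>45.00</price></book>
</catalog>"""


def shred_to_table(conn, xml_text):
    """Parse XML at import time and store each <book> as a relational row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS book "
        "(id TEXT PRIMARY KEY, title TEXT, price REAL)"
    )
    rows = [
        (b.get("id"), b.findtext("title"), float(b.findtext("price")))
        for b in ET.fromstring(xml_text).iter("book")
    ]
    conn.executemany("INSERT INTO book VALUES (?, ?, ?)", rows)
    return len(rows)


def rows_to_xml(conn, max_price):
    """Run an ordinary SQL query and return the matching rows as XML text."""
    root = ET.Element("catalog")
    query = "SELECT id, title, price FROM book WHERE price <= ? ORDER BY title"
    for book_id, title, price in conn.execute(query, (max_price,)):
        book = ET.SubElement(root, "book", id=book_id)
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "price").text = f"{price:.2f}"
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
imported = shred_to_table(conn, CATALOG_XML)
cheap_books_xml = rows_to_xml(conn, 30.0)
print(imported, cheap_books_xml)
```

The trade-off is visible even at this scale: The table structure must be decided before the first document arrives, but once the data is in rows, filtering and sorting come for free from SQL.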

The main advantage of XML databases is their free-form, document-oriented storage engines. There's no need to specify the structure of XML documents before storing them.

As a result, messy, semistructured data is handled well by native XML databases. Organizations with applications oriented around the storage of entire documents, such as manuals, brochures or Web pages, will find native XML databases the right tool for the Web.

In the short term, those with text-oriented applications will find native XML databases a good fit; for others, we recommend investigating what the relational players are doing, as this technology is changing almost week to week.

Rocket Fuel