Oracle Goes XML

Native XML storage moves Oracle into a new market and exerts big pressure on XML startups.

Oracle Corp. is the first among the big relational database vendors to make major changes to its database in response to XML, shaking up the generally overpriced and underperforming native XML database market something fierce but having a lesser effect on current Oracle database sites.

Oracle9i Database Release 2 (click here to see screen) continues to provide the largest range of features available in a database. It is a vast, sprawling product, and descending into it is a bit like falling down Alices rabbit hole: You keep running into new surprises, and youre not quite sure where the exit is.

In addition to its roles as a relational and an XML database, Oracle9i Release 2 is also—take a big breath—a Java application server (now supporting Java 1.3), a message queuing server, both a relational and multidimensional OLAP (online analytical processing) server, and a data mining server.

The Apache Software Foundations Apache HTTP Server is also tightly integrated into the product, although the e-mail and file server features that were part of Oracle9i Release 1s Internet File System component have been moved into Oracles new Oracle Collaboration Suite, announced last month.

Organizations wont necessarily save any money buying the expensive (though well-integrated) Oracle9i over a combination of several other products, although cost savings are realizable if all the features an organization needs are in the Standard Edition of Oracle9i Release 2. This lower-end version costs $15,000 per CPU, or $300 per named user, and includes some big-ticket features, such as XML DB and Java application server features.

In the native XML space, in particular, this pricing is very competitive. For example, Ipedo Inc.s Ipedo XML Database costs $29,000 per CPU, and Software AGs Tamino XML Server costs $45,000 per CPU.

However, Standard Edition lacks many of the features that motivate people to buy Oracle in the first place, including most data warehousing and data analysis features, most uptime and failover features, and Oracles excellent dynamic row-level security access controls. We advise organizations to download Oracles product family guide to be sure they understand the many places where Oracles two versions differ.

There is substantially more database functionality in the standard editions of IBM, Microsoft Corp. and Sybase Inc. databases.

Oracle9i Database Enterprise Edition (the version we tested) costs $40,000 per CPU, or $800 per named user. Only Enterprise Edition can be used with the seven database options Oracle sells, the most important of which are its Real Application Clusters clustering option, OLAP option, data mining option and partitioning option.

All seven options are separately priced. The OLAP option, for example, is $20,000 per CPU, or $400 per named user.

All the major database players are moving to strengthen support for XML data and XML query languages in their products. In the case of IBMs DB2 and Microsofts SQL Server databases, XML technologies and SQL will be on the same level as data access techniques. However, Oracle has gotten there first with its XML DB engine.

XML DB is a combination of three technologies: a large set of SQL functions that allows XML data to be manipulated as relational data (through a view or special SQL functions) as well as to retrieve relational table data in XML format; a native XML data type called XMLType that can store XML data either in an object-relational storage format that maintains the XML DOM (Document Object Model) or as the original text document; and a special hierarchical XML index type to speed access to hierarchies of XML files stored in Oracle9is XML file repository.

XML DB also supports XML Schema, the latest standard for defining the structure of XML documents, although it doesnt support the upcoming XML query language, XQuery. Instead, XML DB uses a combination of XPath and SQL to manipulate XML. The database includes an Extensible Stylesheet Language Transformation engine, made accessible through the built-in copy of Apache, that can retrieve XML data from XML DB and transform it into HTML or other formats.

Previous versions of Oracle and other relational databases support the option of storing XML as text data or extracting data from XML and storing it in normal relational tables, but the interim option of storing data in a format that maintains DOM fidelity (including comments, namespaces, the distinction between elements, and attributes and element ordering) is valuable and is the distinguishing feature of a native XML database. The DOM format doesnt require XML documents to be re-parsed when accessed, and this, in combination with XML and SQL index types, should provide good performance.

Oracle Enterprise Manager, Oracles management tool, has a new folder with which administrators can create and manage XML databases. These are defined using XML Schema documents marked up with additional Oracle-specific tags that define how a DOM maps to Oracle object types and tables.

Although learning and adding these tags will be a burden, doing so is critical: In eWeek Labs tests, importing a Schema document without the tags resulted in Oracle creating a separate table and object type for each different element tag in the file, an inefficient approach. Wed like to see better handling of unmarked documents in a future release. (Native XML databases such as Ipedo or Excelon Corp.s Excelon Extensible Information Server dont require special markup, although Tamino XML Server does.)

Oracle does an excellent job at making XML databases accessible, exposing them through Web DAV (Web-based Distributed Authoring and Versioning) and FTP interfaces. As a result, we could use Windows Explorer to view the XML database as a file system and simply copy files in and out of the database as if the database were a drive.

We tested a variety of Schema files and XML files to see how Oracle handled error conditions. We found we could create normal constraints (such as uniqueness or foreign keys) on the base Oracle tables, which were enforced at load and update time. We were not able to import XML files with extra tags or data types (such as a string where a number should be) not conforming to their Schema file.

However, we found that other attributes—such as minimum or maximum length, the number of times an element could be repeated, or the range of a number—were not enforced by Oracle. Oracle advised us to write a custom database trigger to get that level of validation.

We also encountered two places where we got internal ORA-00600 errors when working with XML data: once with the "distinct" keyword and once when importing an invalid document. The bugs were confirmed by Oracle and will be fixed in a future update.