Oracle Goes XML

 
 
By Timothy Dyck  |  Posted 2002-08-05
 
 
 

Oracle Goes XML


Oracle Corp. is the first among the big relational database vendors to make major changes to its database in response to XML, shaking up the generally overpriced and underperforming native XML database market something fierce but having a lesser effect on current Oracle database sites.

Oracle9i Database Release 2 (click here to see screen) continues to provide the largest range of features available in a database. It is a vast, sprawling product, and descending into it is a bit like falling down Alices rabbit hole: You keep running into new surprises, and youre not quite sure where the exit is.

In addition to its roles as a relational and an XML database, Oracle9i Release 2 is also—take a big breath—a Java application server (now supporting Java 1.3), a message queuing server, both a relational and multidimensional OLAP (online analytical processing) server, and a data mining server.

The Apache Software Foundations Apache HTTP Server is also tightly integrated into the product, although the e-mail and file server features that were part of Oracle9i Release 1s Internet File System component have been moved into Oracles new Oracle Collaboration Suite, announced last month.

Organizations wont necessarily save any money buying the expensive (though well-integrated) Oracle9i over a combination of several other products, although cost savings are realizable if all the features an organization needs are in the Standard Edition of Oracle9i Release 2. This lower-end version costs $15,000 per CPU, or $300 per named user, and includes some big-ticket features, such as XML DB and Java application server features.

In the native XML space, in particular, this pricing is very competitive. For example, Ipedo Inc.s Ipedo XML Database costs $29,000 per CPU, and Software AGs Tamino XML Server costs $45,000 per CPU.

However, Standard Edition lacks many of the features that motivate people to buy Oracle in the first place, including most data warehousing and data analysis features, most uptime and failover features, and Oracles excellent dynamic row-level security access controls. We advise organizations to download Oracles product family guide to be sure they understand the many places where Oracles two versions differ.

There is substantially more database functionality in the standard editions of IBM, Microsoft Corp. and Sybase Inc. databases.

Oracle9i Database Enterprise Edition (the version we tested) costs $40,000 per CPU, or $800 per named user. Only Enterprise Edition can be used with the seven database options Oracle sells, the most important of which are its Real Application Clusters clustering option, OLAP option, data mining option and partitioning option.

All seven options are separately priced. The OLAP option, for example, is $20,000 per CPU, or $400 per named user.

All the major database players are moving to strengthen support for XML data and XML query languages in their products. In the case of IBMs DB2 and Microsofts SQL Server databases, XML technologies and SQL will be on the same level as data access techniques. However, Oracle has gotten there first with its XML DB engine.

XML DB is a combination of three technologies: a large set of SQL functions that allows XML data to be manipulated as relational data (through a view or special SQL functions) as well as to retrieve relational table data in XML format; a native XML data type called XMLType that can store XML data either in an object-relational storage format that maintains the XML DOM (Document Object Model) or as the original text document; and a special hierarchical XML index type to speed access to hierarchies of XML files stored in Oracle9is XML file repository.

XML DB also supports XML Schema, the latest standard for defining the structure of XML documents, although it doesnt support the upcoming XML query language, XQuery. Instead, XML DB uses a combination of XPath and SQL to manipulate XML. The database includes an Extensible Stylesheet Language Transformation engine, made accessible through the built-in copy of Apache, that can retrieve XML data from XML DB and transform it into HTML or other formats.

Previous versions of Oracle and other relational databases support the option of storing XML as text data or extracting data from XML and storing it in normal relational tables, but the interim option of storing data in a format that maintains DOM fidelity (including comments, namespaces, the distinction between elements, and attributes and element ordering) is valuable and is the distinguishing feature of a native XML database. The DOM format doesnt require XML documents to be re-parsed when accessed, and this, in combination with XML and SQL index types, should provide good performance.

Oracle Enterprise Manager, Oracles management tool, has a new folder with which administrators can create and manage XML databases. These are defined using XML Schema documents marked up with additional Oracle-specific tags that define how a DOM maps to Oracle object types and tables.

Although learning and adding these tags will be a burden, doing so is critical: In eWeek Labs tests, importing a Schema document without the tags resulted in Oracle creating a separate table and object type for each different element tag in the file, an inefficient approach. Wed like to see better handling of unmarked documents in a future release. (Native XML databases such as Ipedo or Excelon Corp.s Excelon Extensible Information Server dont require special markup, although Tamino XML Server does.)

Oracle does an excellent job at making XML databases accessible, exposing them through Web DAV (Web-based Distributed Authoring and Versioning) and FTP interfaces. As a result, we could use Windows Explorer to view the XML database as a file system and simply copy files in and out of the database as if the database were a drive.

We tested a variety of Schema files and XML files to see how Oracle handled error conditions. We found we could create normal constraints (such as uniqueness or foreign keys) on the base Oracle tables, which were enforced at load and update time. We were not able to import XML files with extra tags or data types (such as a string where a number should be) not conforming to their Schema file.

However, we found that other attributes—such as minimum or maximum length, the number of times an element could be repeated, or the range of a number—were not enforced by Oracle. Oracle advised us to write a custom database trigger to get that level of validation.

We also encountered two places where we got internal ORA-00600 errors when working with XML data: once with the "distinct" keyword and once when importing an invalid document. The bugs were confirmed by Oracle and will be fixed in a future update.

OLAP


OLAP

Oracle continues to be outpaced by Microsoft in the OLAP space. Although Oracle is stronger in the relational OLAP space, Microsofts multidimensional OLAP server is sophisticated, has broad third-party industry support and is included at no extra charge in its database. Its even very easy to use the OLAP Server included with Microsofts SQL Server to do analysis on data located in Oracle databases.

Both Oracles OLAP and data mining components use APIs that are new (as in Oracle9i Release 1, which shipped last summer) and so far poorly supported by other data query and analysis tool vendors.

In addition, Oracle provides no built-in data gateways to import, export or query data stored in non-Oracle databases, something Microsoft and IBM include as standard in their database servers. Oracle can import just text or XML files on its own; native third-party database gateways are available but sold separately.

Oracle9i Release 2 includes several features that are attractive but not so compelling that they would incite upgrade plans.

For example, Oracle9i Release 2 provides features that improve data warehousing, including the DataGuard Logical Standby Database option, which lets organizations use a standby database to process database queries.

Using this option, the standby database is run in normal mode and is updated using normal SQL commands, so administrators are also able to add indexes or materialized views and use the database for reporting or for data analysis queries. Previously, the standby database had to be run in recovery mode, which does not allow user queries.

Another data warehousing win is support for data compression. Data compression of repeated values in columns allows more data to fit on a disk block and so improves speed for queries that need to access a large set of rows. Compressed blocks also allow more data to fit into Oracles buffer cache, thus lowering memory demands.

The combination of block compression, bit map indexes (another unique Oracle feature) and materialized views (which are supported by DB2 and Microsofts SQL Server) makes Oracle a great platform for in-place, relational OLAP or data warehousing jobs.

Oracle9i Release 2 also provides more help in setting database parameters for optimal performance and makes it easier to access Oracles data recovery tools through a new SQL extension called Flashback Query.

While Oracle9is Log Miner continues to be the more general-purpose tool for reading and undoing past changes from the log, the Flashback Querys powerful "as of" extension to the SQL select command let us undo mistakes or look at older versions of data from a normal SQL query tool.

eWeek Labs West Coast Technical Director Timothy Dyck can be reached at timothy_dyck@ziffdavis.com.

Executive Summary


: Oracle9i Database Release 2 Enterprise Edition">

Executive Summary: Oracle9i Database Release 2 Enterprise Edition

Usability Good
Capability Excellent
Performance Excellent
Interoperability Poor
Manageability Good
Scalability Good
Security Excellent

Oracle takes on a new identity in this release—as a native XML database—providing significantly more capabilities for organizations storing or manipulating XML-based data. Incremental improvements have also been added in the way of data warehousing, administration, fault tolerance and OLAP. Overall, Oracle9i Database Enterprise Edition continues to lead the database market in terms of sheer functionality.

COST ANALYSIS

Oracle continues to be the most expensive database on the market by a significant margin: Enterprise Edition costs $40,000 per CPU, or $800 per named user, and Standard Edition (which is significantly less feature-rich than Enterprise Edition) costs $15,000 per CPU, or $300 per named user. The XML database features are included in both editions, making Standard Edition less costly than most XML databases on the market (where outlandish pricing is the norm).

(+) Native storage of XML data and an XML document repository accessible through a number of common protocols; provides easier ways to restore deleted data using SQL undo extensions; failover database can also be used to process queries, allowing existing hardware to get more use; adds data warehousing improvements; compression of repeated values will speed report generation.

(-) Lacks support for XML Query; high cost; lacks built-in support for importing, exporting or querying data in other databases; OLAP and data warehousing APIs are still quite new and have little support outside of Oracle.

EVALUATION SHORT LIST

  • IBMs DB2
  • Microsofts SQL Server
  • Sybases Adaptive Server Enterprise
  • Software AGs Tamino XML Server
  • Ipedos Ipedo XML Database
  • www.oracle.com/ip/deploy/database/oracle9i

    Related Stories:

  • Commentary: Microsoft, Oracle in Benchmark War
  • Commentary: Oracles Secure, but Rest of World Isnt
  • XML Stores Get Richer Queries
  • FileMaker Pro 6 Adds XML Support

  • Rocket Fuel