XML Standards Updated

Key XML Query 1.0 spec moves closer to completion amid controversy over lack of update features.

The all-too-familiar struggle to balance time to market, simplicity and a complete feature set is in full swing at several key XML standards bodies, and the results will affect all users of XML.

The World Wide Web Consortium just finished one of its busiest periods ever, with 27 publications released last month.

Several of these publications were new or updated working drafts of key forthcoming XML standards, including XQuery 1.0; XPath (XML Path Language) 2.0; XSLT (Extensible Stylesheet Language Transformations) 2.0; and XML 1.1, an update to XML itself.

XML-based technologies have become so important to so many powerful organizations that it is now quite difficult to find consensus on how to define higher-level search and data manipulation techniques for XML. As a result, there are a number of overlaps among different standards that provide dissimilar ways of doing similar things.

For example, both XPath and XQuery provide ways to search through an XML document and return found data (for example, to search through a list of customer records to find those records where the state element is equal to "WA" and the credit check element is equal to "passed").
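The customer-record search described above can be sketched in a few lines. This is a minimal illustration, not a full XPath or XQuery engine: it uses the limited XPath subset built into Python's standard xml.etree.ElementTree module, and the element names (customers, customer, name, state, creditCheck) and sample records are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical customer list; the element names and records are
# assumptions for illustration, not taken from any real schema.
CUSTOMERS_XML = """
<customers>
  <customer><name>Ada</name><state>WA</state><creditCheck>passed</creditCheck></customer>
  <customer><name>Ben</name><state>WA</state><creditCheck>failed</creditCheck></customer>
  <customer><name>Cy</name><state>OR</state><creditCheck>passed</creditCheck></customer>
</customers>
"""


def find_approved_wa_customers(xml_text):
    """Return the names of customers in WA whose credit check passed."""
    root = ET.fromstring(xml_text)
    # ElementTree's XPath subset has no "and" operator, so the two
    # conditions are written as chained predicates, which a record
    # must satisfy together.
    matches = root.findall("customer[state='WA'][creditCheck='passed']")
    return [customer.findtext("name") for customer in matches]


print(find_approved_wa_customers(CUSTOMERS_XML))
```

In full XPath the same test would more commonly be written as a single predicate joined with "and"; the chained form is used here only because of the library's restricted syntax.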

Likewise, XQuery and XSLT provide different ways to write logic to change the format of XML documents (for example, to reorder elements in an XML document or to generate different kinds of markup from the same source file).
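To make the two kinds of reformatting concrete, here is a rough sketch of the results such a transformation produces. It is written in plain Python with the standard xml.etree.ElementTree module rather than in XSLT or XQuery, and the customer record, the function names and the table-row output format are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source record; element names are illustrative only.
SOURCE_XML = "<customer><state>WA</state><name>Ada</name></customer>"


def reorder_children(xml_text, order):
    """Return the document with its child elements sorted into `order`."""
    root = ET.fromstring(xml_text)
    rank = {tag: position for position, tag in enumerate(order)}
    # Tags missing from `order` sort last; sorted() is stable, so they
    # keep their original relative order.
    root[:] = sorted(root, key=lambda child: rank.get(child.tag, len(order)))
    return ET.tostring(root, encoding="unicode")


def to_html_row(xml_text):
    """Generate a different kind of markup (a table row) from the same source."""
    root = ET.fromstring(xml_text)
    row = ET.Element("tr")
    for child in root:
        cell = ET.SubElement(row, "td")
        cell.text = child.text
    return ET.tostring(row, encoding="unicode")


print(reorder_children(SOURCE_XML, ["name", "state"]))
print(to_html_row(SOURCE_XML))
```

An XSLT style sheet or an XQuery expression would state the same mapping declaratively instead of walking the tree by hand.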

The most recent round of standards setting tries to eliminate some of these inconsistencies. XPath 2.0 is now being developed as a subset of XQuery 1.0, and the goal is for XPath 2.0 expressions to be fully compatible with XQuery 1.0 and to generate exactly the same search results. The December XPath 2.0 working draft even states that the XPath 2.0 and XQuery 1.0 working draft documents are generated from common source files.

This convergence of XPath and XQuery (and of XPath and XSLT 2.0) is prompting changes in XPath, and early adopters using XPath 1.0 will have to recheck their XPath queries to make sure they still work as intended in XPath 2.0. (Proposed changes can be found at www.eweek.com/links.)

Those using XML databases should pay close attention to these changes. Despite XPath's many limitations (such as the lack of grouping, sorting and programmability), it's the only XML query language available and so has been widely adopted among XML databases as a central technology.

Update Support Unlikely

With the publication of XML Schema as a W3C Recommendation last May, XQuery is now the most important XML standard under development.

Although clearly better than XPath as a query technology, XQuery itself is also missing critical features. The main gap is lack of support for updates, inserts or deletions of XML data, which means organizations will still have to write custom program code to modify XML data instead of using much simpler XQuery commands.
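The kind of custom program code in question might look like the following sketch, which updates, inserts and deletes data by walking the document tree by hand. It uses Python's standard xml.etree.ElementTree module; the element names, the "reviewed" marker and the business rules are hypothetical, chosen only to show one operation of each type.

```python
import xml.etree.ElementTree as ET


def apply_changes(xml_text):
    """Update, insert and delete customer data with hand-written tree code."""
    root = ET.fromstring(xml_text)
    # findall() returns a list, so removing elements inside the loop is safe.
    for customer in root.findall("customer"):
        check = customer.find("creditCheck")
        # Delete: drop customers whose credit check failed.
        if check is not None and check.text == "failed":
            root.remove(customer)
            continue
        # Update: rewrite the value of an existing element.
        state = customer.find("state")
        if state is not None and state.text == "WA":
            state.text = "Washington"
        # Insert: add a new child element (a hypothetical marker).
        ET.SubElement(customer, "reviewed").text = "yes"
    return ET.tostring(root, encoding="unicode")


SAMPLE_XML = (
    "<customers>"
    "<customer><state>WA</state><creditCheck>passed</creditCheck></customer>"
    "<customer><state>OR</state><creditCheck>failed</creditCheck></customer>"
    "</customers>"
)
print(apply_changes(SAMPLE_XML))
```

Every application that modifies XML has to carry some version of this logic, which is what a standardized update syntax would replace.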

This is a contentious issue because there are fears that not having updates standardized will lead to vendor-proprietary efforts to add update features to XQuery, thus fragmenting the market. Of course, delaying XQuery means the current fragmentation in the query space will only get worse.

"For me, personally, updates are an extremely high priority," wrote Jonathan Robie, one of the XML Query specification editors, in a posting to xml-dev, an XML mailing list. "I am concerned about the likelihood that several similar implementations may hit the market before there is a standard for updates. But I am also very concerned that XQuery 1.0 be released relatively soon."

Updates are technically challenging to add because of their complexity (particularly if XML document structures can be updated as well as the data they contain), but their absence is a big stumbling block for those implementing XML-based data storage in production applications. Developers writing document management or other read-heavy, write-light applications aren't as dependent on XQuery gaining update features soon.

XML, the basis for all this work, is also being updated with a proposed 1.1 release. The update contains just two major changes.

The first is to upgrade Unicode character support to the current Unicode 3.1. XML 1.0 supports Unicode 3.1 characters in data but restricts metadata such as tag names to Unicode 2.0 characters (at 94,140 characters, Unicode 3.1 is close to triple the size of Unicode 2.0). This change won't affect many developers but is appropriate for an international standard as fundamental as XML.

The second change is to make the line-end character sequence used on OS/390 systems a legal line-end symbol in an XML file. This would let OS/390 users edit XML files using native text editing tools and transfer XML files generated on mainframe systems to other systems without any line-ending conversions.

This will save money for those using XML on mainframes but, unfortunately, will require everyone else to update their parsers to handle this new file format. Understandably, there is some opposition to this plan.
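For a sense of what the parser change involves, here is a minimal sketch of line-end normalization. The article does not name the characters; the sketch assumes the rule as written in the XML 1.1 specification, where the mainframe next-line character (NEL, U+0085) and the Unicode line separator (U+2028) are treated as line ends alongside the familiar carriage return and line feed. The function name is hypothetical, and a real parser would apply this step to decoded input before any other parsing.

```python
import re

# Sequences assumed to count as a single line end: CR LF, CR NEL,
# a lone NEL, the Unicode line separator, and a lone CR.
# Two-character sequences are listed first so they match as a unit.
_LINE_ENDS = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_line_ends(text):
    """Replace every recognized line-end sequence with a single line feed."""
    return _LINE_ENDS.sub("\n", text)


# A file mixing mainframe, Windows and old Mac OS line ends.
mixed = "alpha\x85beta\r\ngamma\rdelta"
print(normalize_line_ends(mixed).split("\n"))
```

The logic itself is small; the cost the article points to is that every deployed parser would need the change before mainframe-generated files could be exchanged freely.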

West Coast Technical Director Timothy Dyck can be reached at [email protected] ziffdavis.com.