The Ask Master

An XML technology makes retrieving web data much easier

A forthcoming XML query language will do for XML documents what the Structured Query Language did for relational databases — and it may become the basis for many of the business-to-business interactions on the Internet, experts say.

The language, called XQuery, is being developed by a technical committee of the World Wide Web Consortium (W3C), which expects to finish hammering out the details of the specification by year's end, says Michael Champion, senior research and development adviser at Software AG, vendor of the Tamino XML Database.

By empowering searches of any XML-based content, says Rick Nadler, architect of Borland Software's Web services technologies, XQuery moves us a step closer to "the ultimate fantasy: the idea that you could be querying the entire World Wide Web."

A forerunner of XQuery is SQL (pronounced SEE-kwel), the first standard language that followed the rules of relational algebra for storing and retrieving data in any database system. Oracle, an early adopter of SQL, seized an early lead in the database marketplace and never looked back. XQuery, conceptually similar to SQL, will offer a query-launching system that can be used to extract information from a set of XML documents or an online registry of XML-defined services.

One way XQuery differs from SQL is that SQL separates information about data from the data itself and calls it metadata. An XQuery result can carry both the data and a description of the data, which means that a system retrieving it can understand what sort of data it is working with without further human intervention, says Nelson Mattos, IBM's director of information integrity. "One of XML's beauties is that XML documents are self-describing," Mattos says.
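To make the self-describing idea concrete, here is a minimal sketch in Python rather than XQuery, whose syntax was still in draft. It assumes a made-up order document; the element names, attributes, and values are invented for illustration and do not come from any vendor mentioned here. The point is only that a program can discover the field names and their annotations from the document itself, with no separate schema lookup.

```python
import xml.etree.ElementTree as ET

# Hypothetical order record; every name and value here is an assumption.
ORDER_XML = """
<order id="1001">
  <customer>Acme Corp</customer>
  <total currency="USD">249.50</total>
</order>
"""

def describe(element):
    """Return (tag, attributes, text) for each direct child of an element."""
    return [
        (child.tag, dict(child.attrib), (child.text or "").strip())
        for child in element
    ]

root = ET.fromstring(ORDER_XML)
fields = describe(root)

# The labels travel with the data, so nothing else is needed to print them.
for tag, attrs, text in fields:
    print(tag, attrs, text)
```

In a relational system, the equivalent labels (column names and types) would typically live in the table definition rather than alongside each value.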

Software AG is among the contributors to the W3C's XQuery Working Group, and the company offers a prototype implementation of XQuery on its Web site. Its Tamino XML Database is based on an XQuery predecessor, XML Path Language (XPath), a W3C standard that provides minimal XML data handling. Tamino XML Database will be upgraded to use XQuery as work on the standard progresses, Software AG's Champion says.
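For a rough sense of what path-based selection looks like, the sketch below uses Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. It is not Tamino's interface, and the catalog document is invented for illustration. The path expression picks out matching elements; the price filter and the sorting are done in ordinary Python, the sort of follow-on work a fuller query language is meant to take over.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; titles, categories, and prices are assumptions.
CATALOG_XML = """
<catalog>
  <book category="database"><title>SQL Basics</title><price>30</price></book>
  <book category="web"><title>XML in Practice</title><price>45</price></book>
  <book category="database"><title>Relational Theory</title><price>55</price></book>
</catalog>
"""

catalog = ET.fromstring(CATALOG_XML)

# Path expression: titles of books whose category attribute is "database".
database_titles = [
    title.text
    for title in catalog.findall("./book[@category='database']/title")
]

# Filtering on a numeric price and sorting are handled in Python here,
# because this sketch keeps the path expression to simple selection.
expensive = sorted(
    (b for b in catalog.findall("./book") if float(b.findtext("price")) > 40),
    key=lambda b: float(b.findtext("price")),
    reverse=True,
)
expensive_titles = [b.findtext("title") for b in expensive]

print(database_titles)
print(expensive_titles)
```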

IBM and Microsoft are also strong supporters. They jointly submitted the XQuery draft specification to the W3C last year. IBM's Don Chamberlin, one of the original authors of SQL, is chairman of the XQuery Working Group.

Work is now under way on many specific definitions within XQuery. Functions and operators of the query language — such as a command to seek out a particular form of XML tagging — were included in the XQuery 1.0 draft released on Aug. 27.

Increasingly, Web site content is being built with XML-based standards such as eXtensible HTML, which combines HTML 4.0 and XML 1.0 in a single format. That means more content is available that can be queried by XQuery-based systems, says Robert Weideman, vice president of marketing at Cardiff Software, supplier of the LiquidOffice eForm Management System, an XML-based business forms handling system.

XQuery may also play a key role in the creation of Web services based on one piece of software querying another, then processing the XML-formatted data that it provides, Borland's Nadler says. He foresees Web services in which XQuery-based systems talk to registries of services. A standard for such a registry exists in Universal Description, Discovery and Integration (UDDI), an XML-based standard. What's been lacking is a query language for UDDI, he notes.

The End of Browsing?

If more content can be presented in XML format, then the Web will begin to leave behind its massive search results and "endless browsing paradigm" and begin to connect users more precisely to the data for which they are looking, Nadler says.

Still, IBM's Mattos warns, "There's a very large list of technical issues that still need to be resolved." Right now, XQuery performs more as a content discovery and selection language. Members of the W3C's XQuery Working Group are debating how much functionality they should build into it.

The language offers the ability to search across different forms of XML data storage and come up with relevant data, according to Software AG's Champion. For example, he says, an easy-to-understand XQuery system could query both a repository of e-mail converted into XML documents and a relational database, and bring back data from both. Under the covers, the XML query would be transformed into SQL to retrieve the data from a relational database system such as Oracle or Microsoft SQL Server, he notes.
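The scenario Champion describes can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the mailbox document, the customers table, and the lookup function are all hypothetical, an in-memory SQLite database stands in for Oracle or SQL Server, and the translation into SQL is written by hand rather than generated from an XML query.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical e-mail repository already converted to XML.
MAIL_XML = """
<mailbox>
  <message><from>ann@example.com</from><subject>Invoice 17</subject></message>
  <message><from>bob@example.com</from><subject>Lunch</subject></message>
  <message><from>ann@example.com</from><subject>Renewal</subject></message>
</mailbox>
"""

def build_customer_db():
    """Create a small in-memory relational table (a stand-in for a real RDBMS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("ann@example.com", "Ann Lee"), ("bob@example.com", "Bob Ruiz")],
    )
    return conn

def lookup(email, mailbox, conn):
    """Answer one question from two sources: XML messages and a SQL table."""
    # XML side: collect subjects of messages sent from this address.
    subjects = [
        m.findtext("subject")
        for m in mailbox.findall("message")
        if m.findtext("from") == email
    ]
    # Relational side: the same request expressed as a parameterized SQL query.
    row = conn.execute(
        "SELECT name FROM customers WHERE email = ?", (email,)
    ).fetchone()
    return {"name": row[0] if row else None, "subjects": subjects}

mailbox = ET.fromstring(MAIL_XML)
conn = build_customer_db()
result = lookup("ann@example.com", mailbox, conn)
print(result)
```

A real XQuery engine would be expected to hide both halves behind a single query; here the two lookups are spelled out so the division of labor is visible.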

Few experts, however, expect XQuery to emerge as a carbon copy of SQL for XML data. XQuery can find XML-tagged data and work with other XML technologies to perform additional actions. But no one is ready to say that XML and XQuery will become the basis of future transaction systems, as SQL and relational databases are today. Rather, XQuery's emphasis is on providing a core set of SQL's functionality: namely SQL's "select" function, used to retrieve data; its "join" function, which pulls together data from different tables in the relational database; and its "update" function, in which existing data is changed by a query.
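For readers who have not worked with SQL, the three functions named above look roughly like this. The sketch uses Python's built-in sqlite3 module with two invented tables; the table names, columns, and rows are assumptions made for illustration, and the statements show the SQL side only, not how XQuery would express the same operations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER,
                    title TEXT, price REAL);
INSERT INTO authors VALUES (1, 'Ann Lee');
INSERT INTO authors VALUES (2, 'Bob Ruiz');
INSERT INTO books VALUES (1, 1, 'SQL Basics', 30.0);
INSERT INTO books VALUES (2, 2, 'XML in Practice', 45.0);
""")

# "select": retrieve rows that match a condition.
selected = conn.execute(
    "SELECT title FROM books WHERE price > 40"
).fetchall()

# "join": pull together related rows from two tables.
joined = conn.execute(
    "SELECT a.name, b.title FROM authors a "
    "JOIN books b ON b.author_id = a.id ORDER BY b.id"
).fetchall()

# "update": change existing data, then read it back.
conn.execute("UPDATE books SET price = 35.0 WHERE title = 'SQL Basics'")
updated_price = conn.execute(
    "SELECT price FROM books WHERE title = 'SQL Basics'"
).fetchone()[0]

print(selected, joined, updated_price)
```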

In practical terms, XQuery promises to let users find the information they need much more easily. Instead of having to format a precisely configured SQL query, which requires a programmer's skills, a user will simply name the topic of the file he or she is seeking, and XQuery will survey XML databases or file systems and find it. And to most users, there's no question that this sounds like a better way to track down information.