DataPower Technology Inc. announced that it has broken the terabyte barrier and that its technology is able to transform a 1TB XML document.
Cambridge, Mass.-based DataPower is expected to announce on Monday that its DataPower XA35 XML Accelerator appliance can handle fully streaming XML processing on documents up to 1TB. DataPower officials said streaming processing allows an XML engine to begin producing output before the entire input has been parsed and requires only a constant amount of memory, independent of XML document size.
Streaming processing was previously possible only with low-level custom programming or special-purpose languages. However, DataPower’s approach lets XML developers use the XPath and XSLT (Extensible Stylesheet Language Transformations) standards while still processing XML documents of unlimited size, company officials said.
“In the simplest form, before streaming, processing an XML file required reading the entire file into RAM,” said Eugene Kuznetsov, chairman and chief technology officer at DataPower.
“Because an XML file in RAM is usually several times larger than on disk, that could mean that a 100MB file, for example, may require 300MB or more of free memory. Obviously, this presents a physical limit on the maximum size of a file that can be processed. With streaming, only a portion of a file has to be in-memory at any one time, and output can be produced before the entire file has been consumed. DataPower streaming compiler technology means there is no maximum size limit and XML files over a terabyte long can be processed.”
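The difference Kuznetsov describes can be sketched in a few lines of Python using the standard library's incremental parser. This is only an illustration of the general streaming idea, not DataPower's compiler technology; the function name `stream_totals`, the `<item>` tag and the assumption that every item is a direct child of the root element holding an integer are all hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def stream_totals(source, tag="item"):
    """Yield a running total of <tag> values while the input is still being parsed.

    Assumes (for illustration) that each <tag> element is a direct child of
    the root and contains an integer.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    total = 0
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            total += int(elem.text)
            yield total      # output is produced before the input is exhausted
            root.clear()     # release children that have already been handled

doc = io.BytesIO(b"<items><item>1</item><item>2</item><item>3</item></items>")
print(list(stream_totals(doc)))  # [1, 3, 6]
```

Because handled elements are dropped from the root as the parser moves on, the tree does not grow with the size of the document, which is the property that lets a streaming processor sidestep the memory limit described above.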
However, some XML processing operations can’t be streamed because of their very nature, Kuznetsov said. For example, an XSLT transformation that reverses the order of elements in a file has to buffer all of the input and wait for the last element before it can produce any output, he said. Also, it is often difficult for a developer to determine whether a particular set of XML processing operations can be streamed and to know how later changes may affect “streamability,” he said.
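The reversal example can be made concrete with a small Python sketch. It models element content as plain strings rather than XML, and every name in it (`uppercase_streaming`, `reverse_buffered`, `inputs_read_before_first_output`) is hypothetical; it shows only why one kind of operation can emit output immediately while the other cannot.

```python
from typing import Callable, Iterable, Iterator, List

def uppercase_streaming(items: Iterable[str]) -> Iterator[str]:
    """Streamable: each output depends only on the current input."""
    for item in items:
        yield item.upper()

def reverse_buffered(items: Iterable[str]) -> Iterator[str]:
    """Not streamable: the first output is the last input, so all input is buffered."""
    buffer: List[str] = list(items)
    yield from reversed(buffer)

def inputs_read_before_first_output(
    transform: Callable[[Iterable[str]], Iterator[str]],
    items: Iterable[str],
) -> int:
    """Count how many inputs a transform pulls before emitting its first output."""
    consumed = 0

    def counted() -> Iterator[str]:
        nonlocal consumed
        for item in items:
            consumed += 1
            yield item

    next(transform(counted()))  # pull exactly one output
    return consumed

print(inputs_read_before_first_output(uppercase_streaming, ["a", "b", "c"]))  # 1
print(inputs_read_before_first_output(reverse_buffered, ["a", "b", "c"]))     # 3
```

The streaming transform needs one input per output, while the reversal has to read everything first, so its memory use grows with the size of the input.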
However, Kuznetsov said DataPower’s compiler technology automatically determines which operations can be streamed and processes them in a fully streaming fashion. And with the DataPower XG4 technology and chip set, users do not have to learn new languages or determine which processing can be streamed, company officials said.
DataPower XG4 has been shown to transform a 1TB XML document in XSLT streaming mode, and breaking the “1TB barrier” for XML document size is a milestone, Kuznetsov said. DataPower also was first to break the “1G-bps barrier,” he said.
“It’s really the final frontier in XML processing. No one can say that it’s too slow (gigabit barrier!) or too verbose (terabyte barrier!) for any application,” Kuznetsov said.