A joint effort by Sybase Inc. and Sun Microsystems Inc. has made strides in reversing the rule that as a data warehouse grows larger, it becomes more expensive and more difficult to manage.
The companies last week announced that their reference architecture for a data warehouse with 48.2 terabytes of input data is the largest to be certified by an independent audit. InfoSizing Inc., of Colorado Springs, Colo., conducted an audit of the architecture on behalf of the companies.
Beyond the sheer size, though, the companies said the architecture lets customers save as much as $1 million per terabyte compared with typical data warehouse implementations because it requires less storage capacity to operate.
The architecture, running on Sun midrange servers with Sun StorEdge 9960 systems and Sybase's Adaptive Server IQ Multiplex analytical database, used 22 terabytes of storage. That is significantly less than traditional approaches, which require more than 300 terabytes of storage to support a similar-size data warehouse, said officials at Sybase, of Dublin, Calif. A combination of IQ Multiplex's compression and smart indexing technology was key to the lower storage requirement.
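To put those storage figures in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration built from the numbers quoted above; the variable and function names are made up for the example, and the 300-terabyte figure is treated as a lower bound because Sybase officials said "more than 300."

```python
# Figures quoted in the article, in terabytes.
INPUT_DATA_TB = 48.2             # audited input data
SYBASE_SUN_STORAGE_TB = 22.0     # storage used by the reference architecture
TRADITIONAL_STORAGE_TB = 300.0   # "more than 300", so treated as a lower bound


def storage_ratio(storage_tb: float, input_tb: float) -> float:
    """Return terabytes of storage needed per terabyte of input data."""
    return storage_tb / input_tb


# Storage per terabyte of input for each approach.
sybase_ratio = storage_ratio(SYBASE_SUN_STORAGE_TB, INPUT_DATA_TB)
traditional_ratio = storage_ratio(TRADITIONAL_STORAGE_TB, INPUT_DATA_TB)

# How many times more storage the traditional approach needs (at least).
reduction_factor = TRADITIONAL_STORAGE_TB / SYBASE_SUN_STORAGE_TB

print(f"Sybase-Sun: {sybase_ratio:.2f} TB of storage per TB of input")
print(f"Traditional: at least {traditional_ratio:.2f} TB of storage per TB of input")
print(f"Traditional needs at least {reduction_factor:.1f}x the storage")
```

By these figures the reference architecture stores the warehouse in less than half the space of the raw input, while the traditional approach as described by Sybase would need several times the input size.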
The reference architecture, introduced in November, was meant to demonstrate how customers could increasingly support larger volumes of data without incurring the cost of major redesigns of the data warehouse.
The Sybase-Sun architecture is enabling Kim Ross to build a 10-terabyte data warehouse with less storage than traditional relational databases. With it, Ross, CIO at Nielsen Media Research Inc., a subsidiary of VNU Inc., will be able to conduct faster queries and connect multiple host servers into one instance of the database. The data warehouse, which is still being deployed, will serve as the underpinning of a new application Nielsen Media is building. "This allows us to handle greater levels of details and provide better levels of performance, and both are huge benefits to our customers," said Ross, in Dunedin, Fla.
Competitors such as Teradata, a division of NCR Corp., downplayed the announcement of the 48.2-terabyte data warehouse, saying they have data warehouses in operation at customer sites with more terabytes of data and don't need to conduct audits to gain customer interest. Some customers run as much as 100 terabytes of raw data through a Teradata Warehouse system, said Teradata officials in San Diego.
IBM officials in Armonk, N.Y., made similar claims, pointing to two examples: Sprint Corp.'s PCS unit has a 50-terabyte data warehouse, and State Farm Insurance Cos. has a 72-terabyte data warehouse.
Still, the Sybase-Sun announcement underscores the increasing size of data warehouses as customers collect and store ever more information about their businesses and customers. Sybase and Sun officials said they have about a dozen customers considering deploying the reference architecture.