It's only fitting that the largest retailer should have the world's largest database, but at more than half a petabyte, that's a lot of information, even for Wal-Mart.
The vendor supporting all those bytes of data, NCR's Teradata division, begged for extraordinary permission from the normally secretive Wal-Mart to announce the achievement Wednesday in order to make a point: It argues that its systems can scale without hiccups even at extreme data volumes.
But Wal-Mart being Wal-Mart, it's not saying much. While confirming that it does now have the world's largest data warehouse, and that it permitted its supplier to announce as much, it won't say anything other than “to acknowledge an important milestone,” said Gus Whitcomb, Wal-Mart's director of corporate communications. He referred questions to Teradata, saying it's their announcement.
Beyond issuing a news release stating that Wal-Mart is “increasing its lead as the largest retail data warehouse in the world,” it gave no details as to the size or specifics. The “more than 500 terabytes” figure came from a source who didn't want a name or a company linked to the figure.
The statement did, however, point out that this massive data warehouse is not solely a customer relationship management (CRM) system, but also serves as the base for Wal-Mart's Retail Link decision-support system between Wal-Mart and its suppliers. Retail Link allows suppliers to access large amounts of online, real-time, item-level data to help those suppliers improve operations.
Back at Teradata, officials are prohibited from discussing what they have done for Wal-Mart, but one vice president did take the opportunity to argue what it means from an IT perspective.
“The issues we encounter at Wal-Mart are really not all that different from smaller retail data warehouses,” said Rob Berman, vice president of Teradata's retail operations. He contrasted Wal-Mart's current data warehouse size with its earliest stage, when it was literally less than one-thousandth of its current size.
“When Wal-Mart started with a 320-GByte data warehouse, it used one database administrator [DBA]. Today, the number of DBAs is still fewer than five,” Berman said.
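The “less than one-thousandth” comparison can be checked with quick arithmetic. The sketch below assumes the unconfirmed “more than 500 terabytes” figure as the current size and decimal units (1 TB = 1,000 GB); neither assumption comes from Wal-Mart or Teradata.

```python
# Rough check of the "less than one-thousandth" comparison.
# Assumptions (not confirmed by Wal-Mart or Teradata):
#   - current size is taken as exactly 500 TB, the low end of "more than 500 terabytes"
#   - decimal units: 1 TB = 1,000 GB
GB_PER_TB = 1_000

initial_gb = 320                    # starting warehouse size quoted by Berman
current_gb = 500 * GB_PER_TB        # assumed current size in GB

ratio = initial_gb / current_gb     # initial size as a fraction of current size
growth = current_gb / initial_gb    # how many times larger the warehouse is now

print(f"initial / current = {ratio:.5f}")
print(f"growth factor     = {growth:.1f}x")
print("less than one-thousandth:", ratio < 1 / 1000)
```

Because the source gives the current size only as “more than” 500 terabytes, the real ratio would be somewhat smaller and the growth factor somewhat larger than this estimate.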
A typical database can get slower as it expands and can require more time to complete backups and virus scans, for example. Berman argues that Teradata's approach sidesteps those growth issues. “Our system is nearly 100 percent linear-scalable. It's designed to scale without the management restrictions of other databases.”
How so? “Every time we add a node, we add an equal amount of bandwidth,” he said. “Every time we add a component of processing power, we add another component of bandwidth. We just grow the highway. Every time they grow in DASD [direct-access storage device], we add I/O bandwidth.”
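Berman's highway analogy can be illustrated with a toy model. This is only a sketch of the general idea, not Teradata's architecture: the function name `scan_seconds`, the per-node storage and bandwidth figures, and the "shared bandwidth" comparison case are all hypothetical values chosen for illustration.

```python
# Toy model of linear scaling: every node brings its own storage AND its own
# I/O bandwidth, so the time to read all the data stays flat as nodes are added.
# All names and numbers here are hypothetical, chosen only for illustration.

def scan_seconds(nodes, tb_per_node=1.0, gb_per_sec_per_node=0.5,
                 shared_bandwidth=False):
    """Seconds to read every byte once across the whole system."""
    data_gb = nodes * tb_per_node * 1000          # total data grows with nodes
    if shared_bandwidth:
        bandwidth = gb_per_sec_per_node           # one fixed pipe for everything
    else:
        bandwidth = nodes * gb_per_sec_per_node   # each node adds its own lane
    return data_gb / bandwidth


for n in (1, 10, 100):
    scaled = scan_seconds(n)
    fixed = scan_seconds(n, shared_bandwidth=True)
    print(f"{n:>3} nodes: per-node bandwidth {scaled:>9.0f} s, "
          f"fixed bandwidth {fixed:>9.0f} s")
```

In this model the full-scan time is the same at 1 node and at 100 nodes when bandwidth grows with storage, while it grows a hundredfold when storage is added behind a fixed pipe. Real systems fall short of this ideal, which is consistent with Berman's own “nearly 100 percent” qualifier.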
Retail Center Editor Evan Schuman can be reached at [email protected]