Survey: Biggest Databases Approach 30 Terabytes

 
 
By Matthew Hicks  |  Posted 2003-11-08
 
 
 


Led by a surge in the amount of data being analyzed in data warehouses, the world's largest databases are reaching new heights as they double and triple in size, according to a survey being released on Tuesday.

For the first time in database consulting company Winter Corp.'s five surveys of the largest and most heavily used databases, the largest decision-support database surpassed the biggest transaction-processing database. The flip-flop demonstrates the increasing importance of data analysis for enterprises as they try to better discern trends and patterns, said company President Richard Winter.

"The business drivers are the need to understand customer behavior in a more detailed way so as to be able to increasingly predict what's going to happen," Winter said. "Doing that well requires increasingly detailed data, and that drives up the size of databases."

For its Top Ten Program, Winter Corp. gathers voluntary submissions from companies worldwide that are running large databases. To qualify, a database must be in production and contain at least 1 terabyte of data (or 500 megabytes of data if running on Windows). The results, divided into 24 categories, are based on the amount of online data stored in the database.

The largest decision-support database in this year's survey is from France Telecom and handles 29.2 terabytes of data, triple the size of the top database in that category in Winter's last survey in 2001.

Although they were eclipsed by their analytical brethren, the transaction-processing databases were far from laggards. The largest was from the United Kingdom's Land Registry, a government department overseeing land registrations in England and Wales, which reached a size of 18.3 terabytes. That is nearly double the size of the 2001 winner in that category.

Neither France Telecom nor Land Registry could be reached for comment for this story.

For database managers at AT&T Corp., the second-place winner in overall size for decision-support databases at 26.3 terabytes, the more dramatic rise in decision-support database size should come as little surprise.

A year ago, AT&T began storing on its Security Call Analysis and Management Platform (SCAMP) database two years of detailed data on calls in its telecommunications network, rather than six months' worth of calls. The change occurred because of increasing business requirements to retrieve and analyze more historical information, said Sandy Hall, division manager of real-time customer and service management in AT&T Labs.

"Each record contains 47 fields, so it's a massive amount of data and is brought in and kept at the atomic level," she said.

SCAMP runs on AT&T's Daytona database management system, which AT&T sells commercially. While SCAMP is a decision-support database, it also handles transactions and must be available 24 hours a day, 7 days a week while handling a daily data load of 400 million calls, Hall said.
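To give a sense of scale, the figures Hall cites can be combined in a rough back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the 400 million daily calls hold steady across a full two years, that each call produces one record, and that a terabyte is 10^12 bytes. None of these assumptions is confirmed by AT&T, and the function names are made up for this example.

```python
# Rough sizing from figures quoted in the article; all simplifications are assumptions.
CALLS_PER_DAY = 400_000_000   # daily load cited by Hall
RETENTION_DAYS = 2 * 365      # two years of call detail (assumes a full two years is held)
DATABASE_TB = 26.3            # AT&T's size in the survey's decision-support list
BYTES_PER_TB = 10**12         # assumption: decimal terabytes


def retained_records(calls_per_day: int, retention_days: int) -> int:
    """Number of call records kept, assuming one record per call."""
    return calls_per_day * retention_days


def implied_bytes_per_record(database_tb: float, records: int) -> float:
    """Average storage per record if the whole database were call records."""
    return database_tb * BYTES_PER_TB / records


records = retained_records(CALLS_PER_DAY, RETENTION_DAYS)
bytes_each = implied_bytes_per_record(DATABASE_TB, records)
print(f"{records:,} records, about {bytes_each:.0f} bytes per record")
```

Under these assumptions the database would hold roughly 292 billion records at about 90 bytes apiece, a small footprint for a 47-field record. Treat the result as an order-of-magnitude estimate rather than a statement about how SCAMP actually stores its data.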


Windows Databases Make the List

Winter's latest database survey also reveals the rise of new platforms for running the world's biggest databases. For the first time, a database running on Microsoft Windows reached the top 10 list for transaction-processing databases, and Windows databases grew the fastest in size.

Verizon Communications, which runs its transaction-processing database on the Microsoft SQL Server database software on Windows, reached sixth place in size for all environments at 5.3 terabytes. It was the top transaction-processing database on the Windows platform.

"Windows is popular, but it has been seen as not very scalable," Winter said. "While it's still not the architecture of choice for the most demanding of environments, it's clearly become more scalable."

Database managers at Verizon said they have noticed scalability improvements in more recent versions of Microsoft SQL Server. In fact, Verizon faced a doubling in the size of its top database in the past year as it began storing 13 months of data rather than 6 months, said Noah Gomez, lead development database administrator.

An upgrade to SQL Server 2000, from SQL Server 7.0, helped Verizon handle the big data jump. The company decided to partition the data into more manageable chunks, and the updated database software provided improved query optimization and query plans for data partitioning, said Jose Amando-Blanco, lead production database administrator.
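The idea behind partitioning can be illustrated with a minimal sketch. It is not Verizon's design or SQL Server's mechanism: the class name, the `billed_on` column and the choice of one partition per calendar month are all assumptions made for illustration. The point it shows is that a query over a narrow date range needs to touch only the partitions that can contain matching rows.

```python
from collections import defaultdict
from datetime import date


def partition_key(d: date) -> str:
    """Map a date to its monthly partition, e.g. '2003-03'."""
    return f"{d.year:04d}-{d.month:02d}"


class MonthlyPartitionedTable:
    """Toy table that stores rows in one bucket per calendar month (hypothetical)."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, row: dict) -> None:
        # Route each row to the partition for its month.
        self.partitions[partition_key(row["billed_on"])].append(row)

    def query(self, start: date, end: date):
        """Return (rows, partitions_scanned) for start <= billed_on <= end."""
        lo, hi = partition_key(start), partition_key(end)
        rows, scanned = [], 0
        for key in sorted(self.partitions):
            if lo <= key <= hi:  # skip partitions outside the date range
                scanned += 1
                rows.extend(r for r in self.partitions[key]
                            if start <= r["billed_on"] <= end)
        return rows, scanned


# Load one row per month for 13 months, then query a single month.
table = MonthlyPartitionedTable()
for i in range(13):
    year, month = divmod(2002 * 12 + 10 + i, 12)  # November 2002 onward
    table.insert({"account": i, "billed_on": date(year, month + 1, 15)})

rows, scanned = table.query(date(2003, 3, 1), date(2003, 3, 31))
print(len(rows), "row(s) found,", scanned, "of", len(table.partitions), "partitions scanned")
```

In a real database the optimizer makes the equivalent decision when it builds the query plan, which is why better query plans for partitioned data matter as the data volume doubles.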

Microsoft is preparing the next version of SQL Server, code-named Yukon. The company released a prerelease version late last month.

The winning database underlies a customer-care billing application for Verizon and is one of 11 databases, all on SQL Server, that make up a system it calls Common Office Front End Engine, or COFEE. Three of the other databases also reached Winter's top 10 list for transaction-processing databases on Windows.

As the databases have grown, so has the need for the Verizon DBAs to closely monitor and manage them.

"You want to double your data, but you need to keep the same performance that the customer representatives are used to," Amando-Blanco said. "You need to be more proactive in finding out what the hot spots are on the database and which stored procedures are being called the most."

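The kind of monitoring Amando-Blanco describes can be sketched as a simple tally over a log of procedure calls. The log format, the procedure names and the helper functions below are hypothetical and stand in for whatever tracing tools the Verizon DBAs actually use; the sketch only shows the counting step.

```python
from collections import Counter


def hottest_procedures(call_log, top_n=3):
    """Return the top_n most frequently called procedures as (name, calls) pairs."""
    counts = Counter(name for name, _elapsed_ms in call_log)
    return counts.most_common(top_n)


def total_time_by_procedure(call_log):
    """Sum elapsed milliseconds per procedure to expose expensive hot spots."""
    totals = Counter()
    for name, elapsed_ms in call_log:
        totals[name] += elapsed_ms
    return totals


# Hypothetical trace entries: (procedure name, elapsed milliseconds).
call_log = [
    ("usp_get_bill", 12), ("usp_get_bill", 15), ("usp_get_bill", 11),
    ("usp_update_account", 40), ("usp_update_account", 38),
    ("usp_monthly_rollup", 900),
]

print(hottest_procedures(call_log, top_n=2))
print(total_time_by_procedure(call_log).most_common(1))
```

Ranking by call count and ranking by total elapsed time can point at different procedures, so looking at both gives a fuller picture of where tuning effort is likely to pay off.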

The Winners of Winter Corp.'s Top Ten Program

The data deluge isn't expected to slow in coming years. Respondents to Winter's survey predicted that the largest databases would reach about 60 terabytes by 2006.
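The prediction can be put in perspective by converting it to a compound annual growth rate and comparing it with the tripling reported between the 2001 and 2003 surveys. The sketch below is simple arithmetic on the article's figures. It assumes the 60-terabyte forecast is measured from this year's 29.2-terabyte leader over three years, which the survey does not state explicitly.

```python
def annual_growth_rate(start_tb: float, end_tb: float, years: float) -> float:
    """Compound annual growth rate that takes start_tb to end_tb in the given years."""
    return (end_tb / start_tb) ** (1 / years) - 1


# Observed: the top decision-support database tripled from 2001 to 2003.
observed = annual_growth_rate(1.0, 3.0, 2)

# Projected: from 29.2 TB in 2003 to about 60 TB in 2006 (assumed baseline).
projected = annual_growth_rate(29.2, 60.0, 3)

print(f"observed 2001-2003: {observed:.0%} per year")
print(f"projected 2003-2006: {projected:.0%} per year")
```

On these assumptions the respondents' forecast implies a markedly slower pace than the growth recorded since the last survey, even though the absolute sizes keep climbing.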

In recent years, the prevalence of the Web has helped lead to new sources of data, such as click-stream information. Now new devices, such as radio frequency identification (RFID) readers, are on the horizon and promise to capture more data that could stretch the limits of databases.

"What people talk about most is that computers are getting faster and cheaper and that storage is getting faster and cheaper, but there are also hundreds and thousands of devices planted everywhere that are getting faster and cheaper," Winter said.

Here are the overall winners of Winter's Top Ten Program:

Transaction Processing (all environments)

  1. Land Registry, 18.3 terabytes
  2. BT plc, 11.7 terabytes
  3. United Parcel Service, 9.0 terabytes
  4. Caixa Econômica Federal, 6.9 terabytes
  5. United States Patent and Trademark Office, 5.4 terabytes
  6. Verizon Communications, 5.3 terabytes
  7. Bureau of Customs and Border Protection, 4.1 terabytes
  8. Hewlett Packard Company, 3.2 terabytes
  9. Boeing Company, 3.1 terabytes
  10. CheckFree Corp, 2.9 terabytes

Decision Support (all environments)

  1. France Telecom, 29.2 terabytes
  2. AT&T, 26.3 terabytes
  3. SBC, 24.8 terabytes
  4. Anonymous, 16.2 terabytes
  5. Amazon.com, 13.0 terabytes
  6. Kmart, 12.6 terabytes
  7. Claria Corp., 12.1 terabytes
  8. HIRA, 11.9 terabytes
  9. FedEx Services, 10.0 terabytes
  10. Vodafone, 9.1 terabytes

More details on the winners and the judging criteria for Winter Corp.'s Top Ten Program are available from Winter Corp.
