Almost a year and a half after it was published, the database server benchmark continues to be among our top 20 most popular Labs stories online. In addition, we've received several hundred e-mail messages commenting on various aspects of the story, requesting tuning advice, thanking us for the information, or requesting that the test be performed again using a larger or different set of products or testbed platforms.
We don't have plans to run the test again in the near future (the evaluation took several months to put together and was a thorny nest of complexity), but we can advance the story by putting into context later benchmark results from IBM and Microsoft Corp.
The differences between our tests and the results IBM and Microsoft saw in their own labs using our code demonstrate how myriad environmental and coding factors can affect test results. For example, the combination of a batch statement in Microsoft SQL Server (which has a side effect of requiring use of a client-side cursor) with bidirectional result set scrolling will slow performance considerably even though the functional result will still be correct.
Determining where these subtle performance problems hide is an important task for benchmarks. Benchmarks are also important evaluation tools for locating performance bottlenecks, identifying problem-prone approaches and planning capacity.
The best benchmark, of course, is always one based on an organization's own code and infrastructure. This information, combined with third-party benchmarks using a variety of workloads, provides the best insight into how a product will perform over time and under varying conditions.
The biggest mystery during our database server evaluation was the sharp performance drop-off we saw when testing IBM's DB2 7.2 database.
After the 550-user point, DB2 performance dropped sharply, down to 200 Web pages per second. We were able to repeat these results, and, despite several rounds of e-mail troubleshooting with IBM DB2 performance staff, we weren't able to determine why we were seeing this behavior.
Using the code and configuration files we provided, IBM subsequently set up the benchmark test at its Toronto DB2 development lab to further explore the issue. IBM's testbed was somewhat different from ours: it used a two-way database server with five disks rather than the four-way, 24-disk box we used. However, the basic architecture (a load-balanced BEA Systems Inc. WebLogic Server application server tier, the same DB2 version and setup, the same amount of server memory, and the same size of database) was carefully duplicated, all using eWEEK's data set and exact code and configuration files.
IBM did not see the drop-off we did, even after several rounds of testing with different configurations to try to force the drop-off to occur (see DB2 performance chart).
Timothy Dyck is a Senior Analyst with eWEEK Labs. He has been testing and reviewing application server, database and middleware products and technologies for eWEEK since 1996. Prior to joining eWEEK, he worked at the LAN and WAN network operations center for a large telecommunications firm, in operating systems and development tools technical marketing for a large software company and in the IT department at a government agency. He has an honors bachelor's degree of mathematics in computer science from the University of Waterloo in Waterloo, Ontario, Canada, and a master of arts degree in journalism from the University of Western Ontario in London, Ontario, Canada.