Online exclusive (July 2003 version 1.1 update): Dig deeper into the eWEEK Labs/PC Labs database benchmark by downloading our version 1.1 database configuration and tuning scripts, JSP code, ASP.Net code and spreadsheets containing expanded benchmark results (582 KB .zip file). This file was updated in July 2003 to add a more detailed readme.txt, include PC Magazine's ASP.Net code, add some optimizations to the code and include the originally missing MySQL configuration file. We are also making available the raw import data (981 MB .zip file) we used to construct our databases.
eWEEK and sister publication PC Magazine early last year published results of a comparative performance test of database servers. It had been almost a decade since either publication had performed a similar test, so we expected significant interest in the story.
The actual response, however, was stunning.
Almost a year and a half since it was published, the database server benchmark continues to be among our top 20 most popular Labs stories online. In addition, we've received several hundred e-mail messages commenting on various aspects of the story, requesting tuning advice, thanking us for the information, or requesting that the test be performed again using a larger or different set of products or testbed platforms.
We don't have plans to run the test again in the near future (the evaluation took several months to put together and was a thorny nest of complexity), but we can advance the story by putting into context later benchmark results from IBM and Microsoft Corp.
The differences between our tests and the results IBM and Microsoft saw in their own labs using our code demonstrate how myriad environmental and coding factors can affect test results. For example, the combination of a batch statement in Microsoft SQL Server (which has a side effect of requiring use of a client-side cursor) with bidirectional result set scrolling will slow performance considerably even though the functional result will still be correct.
Determining where these subtle performance problems hide is an important job for benchmarks, which are also valuable evaluation tools for locating performance bottlenecks, identifying problem-prone approaches and planning capacity.
The best benchmark, of course, is always one based on an organization's own code and infrastructure. This information, combined with third-party benchmarks using a variety of workloads, provides the best insight into how a product will perform over time and under varying conditions.
DB2 7.2
The biggest unexplained mystery during our database server evaluation was the sharp drop-off we saw when testing IBM's DB2 7.2 database.
After the 550-user point, DB2 performance dropped sharply, down to 200 Web pages per second. We were able to repeat these results, and, despite several rounds of e-mail troubleshooting with IBM DB2 performance staff, we weren't able to determine why we were seeing this behavior.
Using the code and configuration files we provided, IBM subsequently set up the benchmark test at its Toronto DB2 development lab to further explore the issue. IBM's testbed was somewhat different from ours: it used a two-way database server with five disks rather than the four-way, 24-disk box we used. However, the basic architecture (a load-balanced BEA Systems Inc. WebLogic Server application server tier, the same DB2 version and setup, the same amount of server memory, and the same size of database) was carefully duplicated, all using eWEEK's data set and exact code and configuration files.
IBM did not see the drop-off we did, even after several rounds of testing with different configurations to try to force the drop-off to occur (see DB2 performance chart).
Microsoft SQL Server 2000
While testing Microsoft SQL Server 2000, the issue became the JDBC (Java Database Connectivity) driver. We investigated this issue with the company that created the driver, DataDirect Technologies Inc., and with Microsoft itself, which did further testing in the months following our test. (The Microsoft SQL Server JDBC driver that Microsoft distributes is licensed from DataDirect.)
The key issue, according to DataDirect, was the use of a batch SQL statement (one containing more than one command) combined with a bidirectional cursor. In particular, one commonly used page issues a SQL command to retrieve a set of books in a catalog query. The application limits the number of books returned to 500 by placing “set rowcount 500” and “set rowcount 0” commands around the SQL select statement.
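The batch the article describes can be reconstructed roughly as follows. This is a minimal sketch: only the "set rowcount" wrapping comes from the story, while the table and column names are made up for illustration (the application's real schema isn't shown here).

```java
// Rough reconstruction of the batch described above. Only the
// "set rowcount" wrapping is taken from the article; the table and
// column names are illustrative. Because three commands travel to the
// server as one batch, SQL Server refuses to provide a server-side
// scrollable cursor, and the JDBC driver must emulate one client-side.
public class BatchQuerySketch {
    static String buildCatalogBatch(String subject) {
        // Real code should use parameterized queries, not concatenation.
        return "set rowcount 500 "
             + "select title, author from books where subject = '" + subject + "' "
             + "set rowcount 0";
    }

    public static void main(String[] args) {
        System.out.println(buildCatalogBatch("databases"));
    }
}
```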
“This statement causes problems for the driver because the server will not give the driver a server-side scrollable cursor if the statement is a batch,” said Royce Willmschen, director of research and development at DataDirect, in Morrisville, N.C. “Therefore, we must emulate the scrollable cursor on the client side and process through the result set to find the last row.” Willmschen suggested using JDBC to programmatically set a limit on the number of rows returned rather than doing this in SQL directly.
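Willmschen's suggestion maps onto the standard JDBC API as something like the sketch below, again with hypothetical table and column names. `Statement.setMaxRows()` caps the result set through the driver, so the SQL sent to the server stays a single statement rather than a batch.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch of the driver vendor's suggestion: limit the row count through
// the JDBC API rather than in the SQL text. The query below is then a
// single statement, so the server can supply the scrollable cursor
// itself. Table and column names are illustrative.
public class MaxRowsSketch {
    static final String CATALOG_QUERY =
        "select title, author from books where subject = 'databases'";

    static ResultSet queryCatalog(Connection con) throws SQLException {
        Statement stmt = con.createStatement(
                ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
        stmt.setMaxRows(500);  // driver-enforced limit; no batch needed
        return stmt.executeQuery(CATALOG_QUERY);
    }

    public static void main(String[] args) {
        // No live database here; we only show that the SQL text no
        // longer contains the batch-forcing "set rowcount" commands.
        System.out.println(CATALOG_QUERY);
    }
}
```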
Microsoft tested our application in its own lab and determined that replacing the “set rowcount” commands with an alternate syntax, “select top 500,” resulted in much-improved performance (see Microsoft SQL Server performance chart). This change folded the query limit constraint right into the select statement, eliminating the batch SQL statement.
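The rewrite amounts to a one-line change in the SQL text. A sketch of the before-and-after forms, with hypothetical table and column names:

```java
// Sketch of Microsoft's fix: fold the 500-row limit into the select
// itself with "top", eliminating the batch. Table and column names are
// illustrative, not from the original application.
public class TopQuerySketch {
    // Original form: three commands, forcing a client-side cursor.
    static String rowcountForm(String subject) {
        return "set rowcount 500 "
             + "select title, author from books where subject = '" + subject + "' "
             + "set rowcount 0";
    }

    // Rewritten form: a single statement with the limit built in.
    static String topForm(String subject) {
        return "select top 500 title, author from books where subject = '"
             + subject + "'";
    }

    public static void main(String[] args) {
        System.out.println(topForm("databases"));
    }
}
```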
As in the IBM DB2 case, this Microsoft test should not be directly compared with our numbers, but it's clear the change did have a substantial positive impact on performance. Microsoft tested on a four-way server using slightly slower Intel Corp. Xeon CPUs than we used (550MHz versus 700MHz) and recorded a peak throughput of about 370 pages per second versus our peak throughput of 220 pages per second.
Microsoft didn't have exact performance figures for the application on its setup with our original "set rowcount" version, so we don't know precisely the performance delta this change introduced. However, the difference was substantial, according to Microsoft officials: "As I recall, it was over a doubling of performance with that change—it was a huge difference," said Greg Leake, director of Microsoft .Net Competitive Labs, Developer Division, in Redmond, Wash.
One other important subtlety that Microsoft noticed, and that we'd like to pass along, was a difference between the JSP (JavaServer Pages) version and the ASP (Active Server Pages) .Net version of the test application we used. In the JSP version, we did catalog queries using an "=" exact-match operator. The ASP.Net version instead used a "like" operator with a "%" wild card, which places an extra CPU burden on Microsoft SQL Server.
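The difference between the two versions boils down to the pair of query forms sketched below, with hypothetical table and column names. An exact-match predicate can usually be answered with a cheap index lookup, while a leading-wild-card "like" generally forces the server to examine many more rows.

```java
// Sketch of the operator difference between the two versions of the
// test application. Table and column names are illustrative. The
// wild-card form is the more expensive one for the server to evaluate.
public class OperatorSketch {
    // JSP version: exact match.
    static String exactMatch(String title) {
        return "select title, author from books where title = '" + title + "'";
    }

    // ASP.Net version: wild-card match, costing extra CPU.
    static String wildcardMatch(String fragment) {
        return "select title, author from books where title like '%"
             + fragment + "%'";
    }

    public static void main(String[] args) {
        System.out.println(exactMatch("SQL Performance"));
        System.out.println(wildcardMatch("SQL"));
    }
}
```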
We did not compare the JSP and ASP.Net numbers against each other, an inappropriate comparison in any case because of other necessary changes between the JSP and ASP.Net testbeds. However, the applications were intended to be as close to identical as possible, and both should have used the same kind of comparison operators. Updated code can be downloaded from PC Magazine's own database benchmark report.
Timothy Dyck can be reached at timothy_dyck@dyck.org.