How to Understand and Use Benchmarking

By Jennifer Sutherland  |  Posted 2008-04-14

By Jennifer Sutherland and James Yaple

It's 3 a.m. and you just can't go to sleep. Your recommended server configuration is not providing the service levels that you promised. Customers are complaining about slow response times. Your manager wants to understand why you did not do your "due diligence."

Thank goodness, this example (this time) was only an imagined fiasco. But the scenario is real all too often. So, what are the industry standards for predicting server performance before the servers are delivered and installed? Answers that start with "The vendor told me" or "I read on the Web" just aren't going to cut it. You need a proven, systematic approach that reduces risk and tilts the odds in favor of success, not just for you but for your company.


What Does Benchmarking Do?

That's where benchmarking comes in. Benchmarking measures the actual performance of an IT configuration, be it a storage subsystem, CPU or application software. In contrast to modeling (which attempts to predict a result), benchmarking validates a performance model or hypothesis. In the spirit of Missouri's "Show-Me State" nickname, benchmark testing is the way to "show me" real results. When managers are looking for actual numbers, benchmarking provides them.

Benchmarking, in general, establishes a point of comparison and is present in many daily activities. Benchmark comparisons can be subjective or objective, and qualitative or quantitative. Subjective and qualitative comparisons may generate statements ranging from "I don't like sushi" to "Vendor A's management software is easier to use." Objective and quantitative comparisons are more along the lines of "The average home in Madison, Wis., is 109 percent of the cost of a similar home in Austin, Texas."

What is IT Benchmarking?

IT benchmarking is the process of using standardized software, representing a known workload, to evaluate system performance. Benchmarks are designed to represent customer workloads, such as database or Web applications. They enable a variety of hardware and software configurations to be compared. Many benchmarks are integrated with cost components, so price and performance can be evaluated.
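
As a minimal sketch of how a price/performance comparison might work, the following Python divides total system cost by measured throughput for each configuration. The class, the configuration names and all of the figures are made-up illustrations, not published results or any benchmark organization's official formula.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One hypothetical benchmark result for a configuration."""
    config: str
    throughput: float  # e.g. transactions per second (illustrative)
    total_cost: float  # total system cost in dollars (illustrative)

    def price_performance(self) -> float:
        """Cost per unit of throughput; lower is better."""
        if self.throughput <= 0:
            raise ValueError("throughput must be positive")
        return self.total_cost / self.throughput


def best_value(results):
    """Return the result with the lowest cost per unit of throughput."""
    return min(results, key=lambda r: r.price_performance())


# Made-up figures for illustration only.
results = [
    BenchmarkResult("Config A", throughput=8000.0, total_cost=400000.0),
    BenchmarkResult("Config B", throughput=5000.0, total_cost=200000.0),
]
for r in results:
    print(f"{r.config}: ${r.price_performance():.2f} per unit of throughput")
print("Best value:", best_value(results).config)
```

In this invented example the faster configuration is not the better value: Config A delivers more throughput, but Config B costs less per unit of throughput.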

Performance benchmarks can be likened to government mileage estimates for automobiles. Actual performance in a customer environment with a customer workload will be different. Just because a particular database benchmark says a configuration can support 5,000 concurrent users or 8,000 transactions per second does not mean that customers will see those numbers with their own configuration. Some planners consider it a rule of thumb that actual results are unlikely to exceed published results.

The major components of a benchmark are:

1) a workload, with associated metric(s)

2) a set of conditions, commonly called "run rules"

3) reporting requirements
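
The three components listed above can be pictured as a simple record. The sketch below is only an illustration: the `Benchmark` class, its field names and the sample values are hypothetical and do not come from any real benchmark specification.

```python
from dataclasses import dataclass, field


@dataclass
class Benchmark:
    """Hypothetical container for the three components of a benchmark."""
    workload: str                                   # what is executed
    metrics: list                                   # what is measured
    run_rules: dict = field(default_factory=dict)   # conditions for a valid run
    reporting: list = field(default_factory=list)   # items a report must disclose

    def missing_disclosures(self, report: dict) -> list:
        """Return the required reporting items that a report leaves out."""
        return [item for item in self.reporting if item not in report]


# Sample values are invented for illustration.
bench = Benchmark(
    workload="order-entry database transactions",
    metrics=["transactions per minute"],
    run_rules={"warmup_minutes": 10, "measured_minutes": 60},
    reporting=["hardware", "software versions", "tuning parameters"],
)
report = {"hardware": "4-socket server", "software versions": "DB 1.0"}
print(bench.missing_disclosures(report))
```

A result that omits a required disclosure, such as the tuning parameters here, would be hard to compare fairly against other results.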


Predict Performance with Benchmarking

For performance analysts and capacity planners, benchmarks can enhance the ability to estimate system hardware requirements and predict performance. Commercial capacity planning software bases its what-if analysis of performance scenarios on published benchmark results.
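
One simple form of what-if analysis scales measured utilization by the ratio of two benchmark scores. The sketch below assumes that the workload is CPU-bound and that performance scales linearly with the score; both are strong simplifications, and commercial tools may use far more detailed models. The function name, scores and the 70 percent threshold are hypothetical.

```python
def projected_utilization(current_util, current_score, target_score):
    """Estimate utilization on a target server by benchmark-ratio scaling.

    Assumes a CPU-bound workload and linear scaling with the benchmark
    score, which is a simplification rather than a guaranteed relationship.
    """
    if not 0.0 <= current_util <= 1.0:
        raise ValueError("current_util must be between 0 and 1")
    if current_score <= 0 or target_score <= 0:
        raise ValueError("benchmark scores must be positive")
    return current_util * current_score / target_score


# Made-up inputs: 60% busy today, moving to a server scoring 1.5x higher.
estimate = projected_utilization(0.60, current_score=100.0, target_score=150.0)
threshold = 0.70  # hypothetical planning ceiling
print(f"Projected utilization: {estimate:.0%}")
print("Within threshold" if estimate <= threshold else "Exceeds threshold")
```

Because actual results are unlikely to exceed published results, an estimate like this is best read as optimistic and confirmed by testing.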

The number of possible benchmarks is only limited by the imagination, but they fall into three general categories:

1) industry-standard

2) vendor-oriented

3) customer-sponsored or internal benchmarking


Industry-Standard Benchmarking

ISBs (industry-standard benchmarks) are developed, maintained and regulated by independent organizations. These benchmarks typically execute on a wide variety of hardware and software combinations. The best-known ISB organizations are the SPEC (Standard Performance Evaluation Corporation) and the TPC (Transaction Processing Performance Council).

Typically, hardware and software vendors are heavily represented in the membership of these organizations. The groups solicit input from members and the IT community when benchmarks are created and updated to reflect changes in the marketplace. Some common ISBs are:

  • TPC-C, representing a database transaction workload
  • SPEC jAppServer, representing a multitier, Java 2 Platform, Enterprise Edition application server workload
  • SPEC CPU2006, representing CPU-intensive workloads
  • SPC benchmarks, from the SPC (Storage Performance Council), representing storage-intensive workloads


Often, benchmark organizations require license fees or membership dues before they provide benchmark software. Corporate Data Center Operations joined SPEC and SPC several years ago for access to the benchmark software.


Vendor-Oriented Benchmarking

There are a number of vendor-oriented benchmarks that are application- or component-specific. Customers can typically obtain this benchmark software to test the performance of specific components. Use caution with vendor-oriented tools, however, as they may be biased in particular areas. In addition, they may not be able to test across a spectrum of hardware and software configurations. Some examples of these types of benchmarks include EMC's iorate utility for testing EMC storage arrays, the Oracle Applications Standard Benchmark and Oracle's Orion.


Customer-Sponsored Benchmarking

Customer-sponsored benchmarking involves testing with a customer's workload, or with an independent or vendor workload, at the customer's site. This yields the most relevant information, because the most accurate way to determine how a configuration will perform under a particular workload is to test that workload in that environment. This is typically done with tools from vendors such as HP/Mercury-Interactive or Rational, or with open-source tools such as The Grinder.
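
To give a feel for what a load-testing tool does, here is a minimal Python sketch that runs a transaction concurrently for several simulated users and summarizes response times. It is not the API of The Grinder or any commercial product; `run_load_test` and `fake_transaction` are hypothetical names, and the sleep call stands in for a real request to the system under test.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(transaction, users, requests_per_user):
    """Run `transaction` from several threads and summarize response times."""

    def one_user(user_id):
        timings = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            transaction()
            timings.append(time.perf_counter() - start)
        return timings

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    elapsed = time.perf_counter() - started

    times = sorted(t for timings in per_user for t in timings)
    if not times:
        raise ValueError("no requests were executed")
    # Nearest-rank 95th percentile of the sorted response times.
    p95_index = max(0, math.ceil(0.95 * len(times)) - 1)
    return {
        "requests": len(times),
        "mean_s": sum(times) / len(times),
        "p95_s": times[p95_index],
        "throughput_per_s": len(times) / elapsed,
    }


def fake_transaction():
    """Stand-in for a real request; replace with a call to your system."""
    time.sleep(0.01)


stats = run_load_test(fake_transaction, users=4, requests_per_user=5)
print(f"{stats['requests']} requests, mean {stats['mean_s'] * 1000:.1f} ms, "
      f"95th percentile {stats['p95_s'] * 1000:.1f} ms")
```

Real tools add features this sketch omits, such as ramp-up periods, think time between requests and distributed load generators.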

One benchmark or metric cannot be used for all systems or applications. A computer system is typically designed for one or more primary uses and may be incapable of performing other tasks. For example, the scientific community has evolved benchmarks that measure system performance on numeric computations; these are not suitable for evaluating business applications or a database system. The performance of business and database software is typically dominated by software algorithms rather than by raw hardware speed.

Benchmarking is a key step to understanding the trade-off between cost and level of service. It is a core competence of Computer Measurement Group (CMG) members (note that the M stands for measurement).

 Benchmarking 101 was written by Jennifer Sutherland and James Yaple. Ms. Sutherland's research was done while she worked as a capacity planner for the Wisconsin Department of Health and Family Services. She can be reached at

 Mr. Yaple is the Chief Technology Officer for Corporate Data Center Operations (CDCO) of the U.S. Department of Veterans Affairs. He can be reached at

Benchmarking 101 is based on a paper written by a member of The Computer Measurement Group (CMG), a not-for-profit, worldwide organization of performance and capacity management professionals committed to ensuring the quality of IT service delivery to the business. These individuals publish and present more than 100 papers a year on this and similar topics, all devoted to measuring, analyzing and predicting computer performance. To read the complete paper, located in the CMG repository, go to Benchmarking 101.

Recently, papers from past conferences have been made available to the public. These include papers on platform and application measurement and management, from distributed systems to mainframes. They cover specific technologies such as server virtualization, Java application servers, emerging server and storage technologies, and operating systems that include zSeries, Unix, Linux and Windows.

The opinions and views expressed in this article are solely those of the reviewer/writer and do not necessarily represent the opinions and views of the U.S. Department of Veterans Affairs or of the State of Wisconsin.
