Judgment Day

Published June 30, 2003

Can you tell whether I'm a good columnist?

Does simply reading my columns give you enough information to make up your mind, based on your judgment of my skills?

Or would you feel you were on firmer ground if you could run benchmarks and performance tests that put the columns through simulated reader-response situations?

Don't laugh. This kind of product evaluation happens all the time for a wide variety of technology applications and hardware. And it's not doing either the product vendors or IT buyers any good.

You've read the claims. Product X outperforms competitors in tests! Benchmarks show product Y to be fastest! That's nice. But does it really tell you anything useful about the quality of the product or what it will do for you?

Sure, performance tests can be a valuable tool when looking at some categories of products. But the problem is that they tend to drown out other, more important measures. My rule of thumb is that benchmarks matter only when a product is performing significantly better, or worse, than its peers. If an application or system is performing within the limits of acceptable performance, then a little bit faster or slower doesn't matter.
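For readers who prefer to see that rule of thumb written down, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function name `benchmark_matters`, the idea that higher scores are better, the use of the peer median as the comparison point, and the 25 percent margin standing in for "significantly." It is one way to express the rule, not a recommended threshold.

```python
from statistics import median


def benchmark_matters(score, peer_scores, acceptable_min, margin=0.25):
    """Return True when a benchmark result deserves weight in a decision.

    Assumes higher scores are better. All thresholds are hypothetical.
    """
    # A product that falls below the acceptable floor always matters.
    if score < acceptable_min:
        return True
    # Otherwise the score matters only if it differs from the peer
    # median by more than the chosen margin, in either direction.
    peer_median = median(peer_scores)
    return abs(score - peer_median) / peer_median > margin


if __name__ == "__main__":
    peers = [100, 98, 102]
    print(benchmark_matters(105, peers, acceptable_min=50))  # slightly faster
    print(benchmark_matters(160, peers, acceptable_min=50))  # far faster
    print(benchmark_matters(40, peers, acceptable_min=50))   # unacceptably slow
```

The margin and the acceptable floor should come from your own production requirements rather than from the numbers used here.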

The problem is largely one of perception. For some reason, people tend to see benchmarks and performance tests as “pure.” Unlike evaluations of feature quality, which are mainly based on the reviewer's opinion and can be disputed easily, performance tests yield numerical results that seem to be free from bias.

Anyone who has done performance tests can tell you this isn't the case. Test scenarios that seem to be fair may actually play to the strengths of some products while overemphasizing the weaknesses of others. This is especially true of studies that are commissioned by vendors.

Several applications were developed to do well in standard performance tests, using configuration tweaks that don't translate to improved performance in real life. Recently, there has been controversy in the graphics card world over the claim that Nvidia configured its drivers to do better in the 3DMark benchmark used for graphics card evaluation.

Machinations such as these don't do technology consumers any good because they render benchmarks meaningless. While vendors may like high scores for bragging rights, they're not getting real benefit either. Too often, vendors spend precious time and energy to improve their test scores rather than enhance the features of their products, which would do them more good in the long run.

Of course, we do a lot of performance tests and benchmarking here at eWEEK Labs. But our tests are never the sole criterion for evaluation and are generally not considered the most important area of evaluation.

Unfortunately, no matter how little weight we give a performance test, it still seems to be the one thing that some readers focus on. Several times I've done reviews where I've listed performance test results near the end of the review, stating along with the results that all products performed well.

Inevitably, I get responses from readers taking my results to task, stating that there's no way product X could be faster than product Y and that I don't know how to properly configure product Y. Never mind that when it came to the actual features of the product, I found product Y to be superior; for these readers, performance results overshadow all else.

I'm certain this kind of thinking goes on in the evaluations that companies carry out themselves when looking at products. When considering enterprise IT products, criteria such as ease of deployment, quality of features and whether the product is designed to maximize return on investment should be a lot more important than which product did best on tests that are far removed from production environments.

So take any tests you didn't do yourself with a grain of salt. Make sure any tests you do yourself reflect your own usage scenarios, not some worst-case stress test that bears no relation to day-to-day use.

As far as the column benchmarks are concerned, I'm cooking up one right now, and, as you might expect, my column is scoring extremely well on it.

Jim Rapoza can be contacted at [email protected].