MTBF Reality

By David Morgenstern  |  Posted 2007-02-28

"People want to condense [MTBF] to a single sound byte. They dumb it down and lose the essence of it," said Ed Tierney, director of marketing for storage vendor ATTO Technology, of Amherst, N.Y. The company examines the results of backroom testing as well as the rates from products in the field. While hard disks are a much different product than the HBAs (host bus adapters) ATTO makes, the company has examined a large sample of drives. Tierney said the statistical failure rates were close to the field rates.
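Tierney's point about condensing MTBF to a single figure can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a constant failure rate (the exponential model that usually sits behind a spec-sheet MTBF), and the 1,000,000-hour MTBF and 1,000-drive fleet are hypothetical inputs, not figures from any vendor cited here. The function names are invented for this example.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours * 365 days


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Chance that one drive fails within a year of continuous use.

    Assumes a constant failure rate (exponential model), which is a
    simplification; field behavior may differ.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(mtbf_hours: float, drive_count: int) -> float:
    """Expected number of failed drives per year across a fleet."""
    return drive_count * annualized_failure_rate(mtbf_hours)


if __name__ == "__main__":
    mtbf = 1_000_000  # hypothetical spec-sheet MTBF, in hours
    fleet = 1_000     # hypothetical number of drives in service

    afr = annualized_failure_rate(mtbf)
    print(f"Annualized failure rate: {afr:.2%}")
    print(f"Expected failures per year in a fleet of {fleet}: "
          f"{expected_failures_per_year(mtbf, fleet):.1f}")
```

Under these assumptions, a million-hour MTBF does not describe a drive that runs for a century. It describes a population in which a bit under 1 percent of drives fail each year, so even at the spec-sheet rate a large installation should expect to replace drives regularly.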
According to storage industry analyst Jim Porter of Mountain View, Calif.-based Disk/Trend, there isn't any reliable way to statistically review the reliability of disk drives as used in the field. "There are just too many different kinds of usage sites, too many variations in management skills, and a variety of disk drive types," he said.

What seems clear is that there's a gap between the reliability expectations of manufacturers and customers. The current MTBF model isn't accounting accurately for how drives are handled in the field and how they function inside systems.

Problems with handling drives can come anywhere along the supply chain. I spoke with an analyst a number of years ago who was standing on the docks in Malaysia while visiting the fab operation of a major disk manufacturer. He watched the shipping containers filled with drives being loaded onto the boat. Suddenly, he said, a chain broke and the container fell many stories onto the concrete pier. The chain was refastened and the container was raised up again. Those drives found their way into servers and after-market systems.

And we wonder why some series of hard disks get a reputation for problems? Maybe some of those drives have been treated to a G-shock test not on the record books.

But I see a trend toward a cavalier attitude toward the handling of drives in the field, especially 3.5-inch mechanisms. Perhaps people have grown used to the handling of 2.5-inch notebook drives that are designed to take a bit more of a beating than their larger cousins.

Porter said that it's hard to draw conclusions about MTBF even from a large sample of drives, since they are basically all individual experiences. Remember, he said, that the industry shipped more than 400 million hard disk drives in 2006. "Some disk drives will fail, because it's expected within the reliability specs we're discussing. That's why RAID versions of storage systems were developed, because the failure of two drives within the same storage system at the same time is extremely rare," he said.

Yet the distance between "rare" and "impossible" seems to have been bridged in the minds of customers. Inflated or not, when failure rates are counted in years, the worry is pushed out of mind and into another year's budget. It's easy to think that a failed drive will always be found in someone else's server with someone else's data. No doubt the inflated MTBF stats on the spec sheets have helped that misunderstanding along.

Even if MTBF were a reliable predictor for some perfect hard drive on the testing bench, drives in the field will fail and fail regularly. Worse, there's now the expectation that all data will live forever; that no data will be lost. Come on, that isn't reality.

Here's a slice of this reality disjunction. Vendors tell us that a RAID 6 array can have a "mean time before data loss" of some 86,695 years. Yet at several conferences I've attended in the past year, someone predicted that somewhere soon a RAID Level 6 array will fail. That's with double the redundancy.

Certainly, the lesson from the FAST research is that IT budgets must include a line item for regular replacement of hard disks, even if the MTBF says it isn't necessary. This may cut into spending on new storage systems, which is what CIOs and storage vendors prefer.

I recall a bit of discussion about MTBF at a meeting of the San Francisco SNUG (storage networking user group) last summer. One reseller in the group asked a storage vendor to "quit publishing this crap." That recommendation will be a tough one for the marketing department to execute.

In the meantime, we can all start by taking a more realistic attitude toward MTBF. What do you think? Is MTBF an outrage? Or did you see through it all along?

David Morgenstern is Executive Editor/Special Projects of eWEEK. Previously, he served as the news editor of Ziff Davis Internet and editor for Ziff Davis' Storage Supersite.

In 'the days,' he was an award-winning editor with the heralded MacWEEK newsweekly as well as eMediaweekly, a trade publication for managers of professional digital content creation.

David has also worked on the vendor side of the industry, including companies offering professional displays and color-calibration technology, and Internet video.


