There's a reason why enterprise storage investments are among the least-soft sectors of IT spending: it's almost impossible for any organization to know too much.
Even in industry sectors that don't seem strongly data-driven, it's knowledge that creates competitive advantage. The strategic assets of a company such as Exxon Mobil Corp. aren't so much the oil in the ground as the costly logs of data from test wells that identify where production drilling will pay. The critical assets of an e-business are the records that reveal how a Web site visitor is transformed into a buyer, and a first-time buyer into a lucrative long-term customer.
Storage must be measured not just in gross terms of capacity but also in net terms of useful access vs. cost. Capacity measures how much can be known, but speed measures how much that knowledge is worth; density measures the overhead of knowledge upkeep; longevity of various storage media determines how long that knowledge will yield returns.
Capacity alone is hard-pressed to keep up with the flood tide of new bits. An information-rich economy generates ever more data per person: current estimates run to about 250MB per person per year, and that figure is doubling every year.
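As a rough sketch of what that doubling rate implies (the ten-year horizon and the Python rendering are illustrative choices, not figures from any survey), the projection looks like this:

    def projected_data_per_person(base_mb=250, years=10):
        """Project annual data generated per person, doubling each year."""
        for year in range(years + 1):
            volume_mb = base_mb * (2 ** year)
            print(f"Year {year:2d}: {volume_mb / 1024:7.1f}GB per person")

    projected_data_per_person()

At that pace, today's quarter-gigabyte per person becomes roughly a quarter of a terabyte per person within a decade.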
But mere quantity measures fail to reflect the increasing complexity of the tasks that storage systems are expected to perform. Raw data retrieval times continue to improve at impressive rates, but perhaps not quickly enough to offset more elaborate queries that involve a greater number of data relations.
The greatest distortion in simple measurements of storage capacity growth is in their failure to reflect the growing cost of duplicates. For every e-mail sent, every Usenet message posted, every new Windows device driver written and distributed, dozens or thousands or millions of copies soon occupy space along chains of servers or on uncountable separate workstations.
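One way to put a number on that distortion is to identify copies by content rather than by name. The sketch below is a hypothetical illustration (the directory path and choice of hash are assumptions, not a description of any particular product); it simply tallies the bytes consumed by redundant copies in a file tree:

    import hashlib
    import os
    from collections import defaultdict

    def duplicate_overhead(root):
        """Tally bytes consumed by redundant copies under a directory tree,
        identifying duplicates by a hash of file contents."""
        sizes_by_hash = defaultdict(list)
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    with open(path, "rb") as f:
                        data = f.read()
                except OSError:
                    continue  # skip unreadable files
                sizes_by_hash[hashlib.sha256(data).hexdigest()].append(len(data))
        # Every copy beyond the first occupies space without adding knowledge.
        return sum(sum(sizes[1:]) for sizes in sizes_by_hash.values())

    print(f"{duplicate_overhead('.') / 2**20:.1f}MB held in duplicate copies")

Hashing contents rather than comparing file names catches duplicates however they are renamed or rerouted along those chains of servers.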
It's fortunate that magnetic hard disks have surpassed most 20th-century projections of their physical limits. But those limits likely do exist. Density growth may continue at present rates for another four years or so, but IBM researchers expect to hit a limit of roughly 40Gbits per square inch by 2005.
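To see what such a ceiling would mean in practice, here is a back-of-the-envelope sketch; the platter geometry is an assumed value chosen for illustration, not a figure from IBM:

    import math

    # Assumed geometry: a 3.5-inch-form-factor platter recording between
    # an inner radius of 0.6 inches and an outer radius of 1.6 inches.
    AREAL_DENSITY_GBITS = 40      # projected limit, Gbits per square inch
    OUTER_R, INNER_R = 1.6, 0.6   # assumed recording band, in inches

    recording_area = math.pi * (OUTER_R**2 - INNER_R**2)  # sq. inches per surface
    gbits_per_surface = AREAL_DENSITY_GBITS * recording_area
    gbytes_per_platter = 2 * gbits_per_surface / 8        # two surfaces, 8 bits/byte

    print(f"Roughly {gbytes_per_platter:.0f}GB per platter at the projected limit")

Even generous assumptions leave a single platter well under 100GB at that density, which is why the question of what comes next matters.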
What next? Optical media, able to focus at varying depths on transparent/fluorescent layers, could enable parallel access to multiple data streams while storing a terabyte of data on a successor to the DVD. Holographic methods enable data retrieval based on content patterns rather than physical storage locations, potentially reducing the capacity and processing overheads of creating and maintaining database indices. Atomic force microscope technologies store hundreds of Gbits per square inch with nanoscale mechanical media and probes (though data transfer speeds are as yet far from competitive).
The need for ever more storage will surely continue to drive invention.