Room at the bottom
The mantra of IT advancement is the 40-year-old empirical observation dubbed Moore's Law, a remarkably accurate 1965 prediction by Intel co-founder Gordon Moore that the feasible density of packing electronic devices would follow an exponential trend. Often asked, though, is the question of whether that rate of progress can continue as hardware nears frontiers defined by the nature of matter and energy.
"What's got the semiconductor industry nervous is that there's been a slowing of the improvement rate in performance," said Thomas Theis, director of physical sciences at IBM Research, in Yorktown, N.Y. "The power dissipation, the heat from quantum mechanical tunneling events: the insulators are just over a nanometer thick in 90-nanometer technology. One nanometer doesn't make much of a barrier for a free electron."
Mere continuation of the Moore's Law trend to greater physical density of devices is not an attractive option, even if it were physically possible, Theis said. Like the density of devices, "factory costs have always been exponential," he added. "At some point, they only become affordable by nation-states or some kind of consortium of nation-states. There's a cost to maintaining tolerances."
It's therefore important, Theis continued, to ask more fundamental questions about the direction that future hardware design and fabrication should follow. "If you look at biological systems," he observed, "the amount of what is done with extremely high precision is small. Biology works just well enough; the system as a whole functions. The focus of our research is to deliver the information in such a way that the error rate is higher, but it's good enough to make the process work."
The Internet itself demonstrates this general approach: Its packet-based communication relies on connection and transfer protocols, such as TCP/IP and Ethernet, that are designed to function "well enough" rather than requiring perfection to function at all.
IBM is only one of several research centers at which we're finding a growing trend toward resilient and fault-tolerant protocols, even at the level of chip-to-chip connection and on long-distance links. Enterprise buyers should be increasingly prepared to discuss their tolerance for error rates and variabilities in system performance, rather than expecting single-valued measures of performance in system specifications.
Not merely in metaphor but also in direct application, biology and biochemistry may contribute to the creation of IT hardware. "What we're about to publish is a nanowire transistor technique that uses optical lithography to make a coarse template, then relies on polymer self-assembly to define the channels of the transistor," said Theis. After carving out the major pathways of a microchip by currently conventional techniques, the process that Theis described lets the basic mechanisms of molecules finding and binding to each other complete the job with nanoscale precision.
"We think these are relatively inexpensive processes," said Theis. His team's experimental prototype is "a very ugly device right now; it's just a toy," he said. "But it's a step in the right direction," he assured us: a step, that is, toward continued progress in performance without an unacceptable explosion of manufacturing costs.
Another trend that surfaced during our conversations at several research centers is the one toward more cost-effective processor architectures, following the mantra of "performance per watt" that Intel has recently adopted but that vendors including Advanced Micro Devices Inc. and Sun identified long ago as the future figure of merit for CPUs. The power that goes into a computer doesn't lift weights or pump water; it all turns, eventually, into heat, and the challenge of keeping densely packed server installations cool enough to run reliably is becoming a critical concern.
Rather than seeking performance growth in ever-more-complex devices, therefore, "more system builders are moving to multicore architectures," said Sun Solaris Group Manager Chris Ratcliffe in Santa Clara, Calif. "The number of cores will expand rapidly. We have 32-way CPU systems in the works; it looks like a 1U [1.75-inch] rack system, but it's immensely complex and can run hundreds of thousands of applications on that single system."
In a number of conversations with eWEEK Labs, Sun Executive Vice President and Chief Technology Officer Greg Papadopoulos has painted just this picture of multiple cores rather than increased core complexity as the future of optimal processing performance. Sun has followed through, currently shipping to early-access customers eight-core, 32-thread CPUs that use far less power per unit of capacity than competing architectures do. The work described by Ratcliffe continues that trend.
Value for money is also a major driver of IBM's innovative Blue Gene architecture, with a Blue Gene rack holding 2,400 processors now within one week of becoming available to users at the SDSC and its nationwide networked user community. That single rack, affording 5.7 teraflops of computing power with 512GB of memory, represents "unheard of" density, said SDSC Production Systems Division Director Richard Moore.
"The paradigm shift is that IBM slowed down the processors to 700MHz," Moore explained, pointing out that the resulting reduced heat output is complemented by the innovative mechanical design of the machine, with its odd slanting sides that maximize cooling air flow across each horizontal subsystem circuit board within the cabinet. Inside the Blue Gene box, Moore added, are five independent high-speed networks that maximize overall system throughput. "It's an important architecture ... very cost-effective," Moore said.
Researchers with networked access to the SDSC installation will likely apply the Blue Gene's power to complex tasks such as simulations in chemistry and physics. Industrial and commercial users are also exploring similar approaches to failure-mode prediction in engineering projects and in pharmaceutical development.
Less exotic, but equally focused on improved high-performance-computing value, are currently available tools for scavenging available CPU cycles across a network, such as the Condor system from the University of Wisconsin-Madison. Condor is now easing access to idle computing power at facilities such as MIT's Laboratory for Nuclear Science, in Cambridge, Mass. It's also found at a growing number of commercial sites, with one deployment up and running for the last year on a 200-server grid at The Hartford Financial Services Group Inc., in Hartford, Conn.
The normal turnover of PCs on desktops, explained MIT nuclear lab Associate Director Pat Dreher, can sometimes mean that the idle computers available during off-hours have more power than dedicated research clusters whose replacement or upgrade depends on scarce project funds. That perverse situation is one that tools such as Condor help to turn into a benefit.
At the MIT installation, "when people aren't working, [Condor] queries machines," said Dreher. "If they aren't active (no one's logged in, there's no keyboard or mouse activity), a job is put on that machine, and the results [are] written back to the central area." When a user returns to work, "the job is checkpointed out, [and] a snapshot is taken and stored on disk in a frozen state until it can get cycles to finish," he added.
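The cycle-scavenging loop that Dreher describes can be illustrated with a small simulation. What follows is a minimal sketch, not Condor's actual implementation or API: the `Job`, `Machine` and `Scavenger` names are hypothetical, "idle" is assumed to mean simply that no one is logged in and there is no keyboard or mouse activity, and a checkpoint is modeled as nothing more than a saved step count.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    """A unit of work that can be paused and resumed (hypothetical model)."""
    name: str
    total_steps: int
    done_steps: int = 0

    @property
    def finished(self) -> bool:
        return self.done_steps >= self.total_steps


@dataclass
class Machine:
    """A desktop PC whose owner may or may not be using it."""
    name: str
    user_logged_in: bool = False
    input_activity: bool = False  # keyboard or mouse activity
    job: Optional[Job] = None

    @property
    def idle(self) -> bool:
        # Assumed idleness test: nobody logged in, no input activity.
        return not self.user_logged_in and not self.input_activity


class Scavenger:
    """Toy scheduler that places queued jobs on idle machines."""

    def __init__(self, machines: List[Machine], jobs: List[Job]) -> None:
        self.machines = list(machines)
        self.queue: List[Job] = list(jobs)      # jobs waiting for cycles
        self.checkpoints: Dict[str, int] = {}   # stands in for snapshots on disk
        self.results: List[str] = []            # stands in for the central area

    def tick(self) -> None:
        """Run one scheduling round across all machines."""
        for m in self.machines:
            if not m.idle:
                if m.job is not None:
                    # Owner is back: save progress and requeue the job.
                    self.checkpoints[m.job.name] = m.job.done_steps
                    self.queue.append(m.job)
                    m.job = None
                continue
            if m.job is None and self.queue:
                # Idle machine: place the next job, resuming from its snapshot.
                m.job = self.queue.pop(0)
                m.job.done_steps = self.checkpoints.get(m.job.name, m.job.done_steps)
            if m.job is not None:
                m.job.done_steps += 1  # do one step of work
                if m.job.finished:
                    self.results.append(m.job.name)
                    self.checkpoints.pop(m.job.name, None)
                    m.job = None
```

In this sketch, a job interrupted by a returning user loses no completed steps: it waits in the queue with its saved step count and resumes on whichever machine next becomes idle. A real system must also capture and restore the process's memory and open files, which this toy model leaves out.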
For memory devices, Theis said, there are many roads to explore. "Memory devices just need an easily distinguishable on and off state," he explained, "but every successful device that's been used for logic has amplification. It allows you to restore signals against a reference, the ground or the voltage supply, so small variations in one device after another don't drive the system out of spec."
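Theis' point about amplification can be made concrete with a toy calculation. The sketch below is an illustration under loudly simplified assumptions, not a circuit model: signal levels are idealized as 0.0 and 1.0, every device is assumed to lower the level by the same fixed `drift`, and "restoring against a reference" is modeled as snapping the level back to whichever rail it is closer to. The function name and parameters are hypothetical.

```python
def propagate(signal: float, stages: int, drift: float, restore: bool) -> float:
    """Pass a nominal 0.0/1.0 signal through a chain of devices.

    Each device lowers the level by `drift` (an assumed, fixed degradation).
    With `restore` enabled, each device also re-amplifies the level to the
    nearest rail, standing in for restoration against ground or supply.
    """
    level = signal
    for _ in range(stages):
        level -= drift  # small variation introduced by this device
        if restore:
            # Compare with the midpoint and snap back to a clean 0 or 1.
            level = 1.0 if level > 0.5 else 0.0
    return level


# A logical 1 sent through 20 devices that each lose 5 percent of full scale.
with_restoration = propagate(1.0, stages=20, drift=0.05, restore=True)
without_restoration = propagate(1.0, stages=20, drift=0.05, restore=False)
```

Under these assumptions the restored chain still delivers a clean logical 1 after 20 devices, while the unrestored chain has drifted below the midpoint and would be read as a 0. That is the sense in which small variations in one device after another can drive a system out of spec.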