Ready or not, the age of so-called “tera-scale” computing is about to start.
On Feb. 9, at the ISSCC (International Solid-State Circuits Conference) in San Francisco, Intel will offer the world a glimpse of the technology behind its tera-scale computing efforts, including the design of an 80-core processor.
“Tera-scale computing is coming,” said Jerry Bautista, director of Intel's Tera-scale Computing Research Program. “What we are doing is bringing supercomputing-like capabilities to PCs, servers, and handheld and mobile devices.”
During the ISSCC show, Intel researchers are preparing to present nine different research papers, one of which will detail the company's efforts to develop a processor capable of delivering teraflops (trillions of calculations per second) of performance while consuming as little as 62 watts of power.
While Intel officials were quick to say that the 80-core chip is more of a proof-of-concept design that will likely never come to market, it is possible that within 10 to 15 years certain elements and design techniques used to create this chip will be incorporated into other processors.
Intel has already said that it would try to offer processors with 10 or more cores by the end of the decade. In November 2006, the Santa Clara, Calif., company began its march toward more and more cores when it released quad-core chips for servers and workstations that delivered 1.5 times the performance of its dual-core chips.
During February 2007, the company also announced that it had developed a method to achieve a 45-nanometer manufacturing process, which will allow more transistors to be placed on each chip for greater performance while at the same time scaling down the size of the chip.
Intel researchers predict that this type of multicore chip can find practical use in gaming, “virtual” travel and learning experiences, media management, and enterprise applications like data mining. However, several analysts remain skeptical about the lofty possibilities of multicore processing, since most software written today is not ready to take advantage of this type of multithreaded chip.
“It's tough enough to know what to do with four cores,” said Roger Kay, an analyst with Endpoint Technologies.
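Neither Kay nor Intel put numbers on that objection, but Amdahl's law, the standard rule of thumb for parallel speedup, makes the point concrete. The sketch below uses illustrative figures only, not anything either party cited, to show why dozens of cores are wasted on software that is only partly parallel:

```python
# Amdahl's law: speedup on n cores when only a fraction p of a
# program's work can run in parallel. (Illustrative only; not a
# model Intel or Kay cited.)
def speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# A program that is 50% parallel gains little beyond a few cores:
for n in (2, 4, 80):
    print(f"{n:3d} cores, 50% parallel: {speedup(0.5, n):.2f}x")
# -> roughly 1.33x, 1.60x and 1.98x; 80 cores buy almost nothing
#    unless software is rewritten to push p toward 1.
```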
To counter this drawback, Intel said it is currently working on 100 different research projects to follow up its announcement, including work, both internally and with outside ISVs, to develop applications that can take advantage of this type of processing power.
Unlike the cores in the current crop of dual- and quad-core processors, the cores used to create the teraflop chip are much simpler, general-purpose Intel Architecture designs. This allows some cores to take on highly specialized functions, such as boosting performance by running jobs in parallel.
The chips will also include smaller, slower cores that still allow top performance while reducing the amount of power the processor consumes.
To achieve this performance, Intel researchers designed the chip's “tile” structure, which allows identical cores to be replicated across the chip. In the case of the 80-core chip, 100 million transistors are spread across 80 core tiles, or roughly 1.25 million transistors per tile.
In the past, Intel researchers have said their findings show that eventually billions of transistors will be able to fit on a single piece of silicon. Referring to the 80-core chip, an Intel spokesperson later explained, “The researchers were bounded somewhat here by what production samples they could get out of the Fabs. For example, there was no special reason 80 was chosen: [It was] partly a measure of what was available.”
Intel Keeps Up with Moore's Law with 45-Nanometer Manufacturing
This development is important since it coincided with Intel's announcement that it had achieved a 45-nanometer manufacturing process, keeping the company in line with Moore's Law, which states that the number of transistors on an integrated circuit will double roughly every two years. The growing number of transistors that can fit onto a piece of silicon allows a larger number of smaller cores to sit on the chip without increasing its overall area.
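As a rough worked example of that cadence (the two-year doubling period and the use of the 80-core chip's transistor count as a baseline are assumptions for illustration, not Intel roadmap figures):

```python
# Projected transistor count under Moore's law, assuming a doubling
# roughly every two years. Baseline values are illustrative, not
# Intel roadmap figures.
def transistors(baseline, years, doubling_period=2.0):
    return baseline * 2 ** (years / doubling_period)

# Starting from the 80-core chip's 100 million transistors:
for years in (2, 6, 10):
    print(f"+{years} yrs: ~{transistors(100e6, years) / 1e6:,.0f}M transistors")
# -> ~200M, ~800M, ~3,200M: billions of transistors on one die
#    within a decade, in line with the researchers' prediction above.
```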
Each tile core is made of a compute element and a router. These routers, which control the flow of information to and from the core, contain five ports, which allow information to be shifted in and out at 80GB per second.
This grid design creates a mesh, which Intel calls a “network-on-a-chip” architecture, that provides very high-bandwidth communication between the cores within the processor, allowing them to move terabits of information per second among themselves.
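Intel's disclosure covers only the port count and bandwidth; the sketch below fills in the rest with standard network-on-chip conventions that Intel has not confirmed here: the five ports map to the four mesh neighbors plus the local compute element, the 80 tiles form an 8-by-10 grid, and packets follow dimension-ordered (XY) routing.

```python
# A toy model of Intel's "network-on-a-chip" mesh. Assumptions (not
# confirmed by Intel): an 8x10 grid of tiles, five router ports
# (north/south/east/west plus the local core) and dimension-ordered
# XY routing.
WIDTH, HEIGHT = 8, 10  # 80 tiles

def xy_route(src, dst):
    """Return the list of (x, y) tiles a packet crosses, routing
    fully in X first, then in Y (deadlock-free on a mesh)."""
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path

# Worst-case trip: corner to corner.
hops = len(xy_route((0, 0), (WIDTH - 1, HEIGHT - 1))) - 1
print(f"worst-case hops: {hops}")  # -> 16
# With each router port moving 80GB per second, even a cross-chip
# message stays on silicon instead of leaving the package.
```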
Finally, Intel researchers created a modular clocking scheme that allows the elements within each individual tile core (the FP engine, the data memory and the router) to power down and save energy. Depending on the type of application the system is running, the technology allows these blocks to wake or sleep on demand.
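A toy model shows how this fine-grained gating pays off. The per-block wattages below are invented for illustration, since Intel disclosed only chip-level power figures:

```python
# Toy model of the per-tile sleep/wake scheme. Each block inside a
# tile (FP engine, data memory, router) can be gated independently.
# The watt figures are invented for illustration only.
BLOCK_POWER = {"fp_engine": 0.45, "data_memory": 0.20, "router": 0.10}

def tile_power(awake):
    """Power draw of one tile given the set of blocks left awake
    (sleeping blocks are modeled as drawing nothing)."""
    return sum(BLOCK_POWER[b] for b in awake)

# A memory-bound phase can put the FP engine to sleep:
busy = tile_power({"fp_engine", "data_memory", "router"})
mem_bound = tile_power({"data_memory", "router"})
print(f"fully awake: {busy:.2f} W/tile, FP asleep: {mem_bound:.2f} W/tile")
# Summed over 80 tiles, gating idle blocks is what lets the chip
# land near 62 watts instead of every block burning power at once.
```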
The scheme helps each chip deliver about 16 gigaflops per watt, a figure that follows from dividing the chip's roughly 1 teraflop of throughput by its 62-watt power draw.
The 80-core chip can run at 3.16GHz while processing a little more than 1 teraflop of information and using only 62 watts, and it needs less than 1 volt to do so. If the voltage is increased to 1.2 volts, for example, the processor's clock speed rises to 5.1GHz and performance climbs to 1.63 teraflops, while the thermal envelope grows to 175 watts.
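Those figures line up reasonably well with the classic dynamic-power relationship, in which power scales with the square of voltage times frequency. The 0.95-volt baseline in the sketch below is an assumption; Intel said only that the chip runs on less than 1 volt:

```python
# Sanity check of Intel's figures against the standard dynamic-power
# scaling rule P ~ V^2 * f. The 0.95V baseline is an assumption;
# Intel said only that the chip runs on "less than 1 volt."
v0, f0, p0 = 0.95, 3.16, 62.0  # volts, GHz, watts (baseline)
v1, f1 = 1.20, 5.10            # the higher-voltage operating point

p1 = p0 * (v1 / v0) ** 2 * (f1 / f0)
print(f"predicted: ~{p1:.0f} W, reported: 175 W")  # -> ~160 W
# Leakage current, which grows with voltage and is ignored by this
# formula, plausibly accounts for the remaining gap. Performance
# scales with frequency alone: 1.01 TFLOPS * (5.10 / 3.16) = 1.63
# TFLOPS, the same figure Intel quotes.
```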
The next part of the research process, according to Bautista, will be to develop three-dimensional stacked memory, which will be placed on top of the processor for heat reasons.
For Kay, some of the questions left unanswered about the 80-core chip include what researchers intend to do in terms of providing cache memory and what will ultimately happen with the I/O.
“What you do have is a chip that shows what you can do when you manufacture at the 65-nanometer level,” Kay said.
Still, Bautista and other researchers said this type of processor and technology can find practical applications in the real world. For example, it could help deliver greater performance to legacy applications that have never run on multithreaded chips or taken advantage of parallel computing.
“It's something that can work reasonably well, and we think that would be a pretty good approach,” Bautista said.