Intel Abuzz With Core-Mania

With its new circuitry, chip maker has potential to build chips with hundreds of cores.

Intel is about to deliver the opening salvo in a wave of multicore processors that could ultimately lead to chips with scores of cores.

The chip maker will begin the rollout of its Core Microarchitecture—new chip circuitry that emphasizes power efficiency—June 26 with the arrival of the dual-core "Woodcrest" Xeon 5100-series server chip. But Intel researchers said June 15 that they have already seen results with projects associated with their Tera-scale Computing effort to explore processors containing tens or even hundreds of cores.

Intel has already implied that it is aiming for processors with more than 10 processor cores by the end of the decade. However, Tera-scale chips would look and act differently. They would be built from numerous relatively simple general-purpose IA (Intel Architecture) x86 processor cores—with the potential to include specialized cores for some jobs—to boost performance by dividing up jobs and running them in parallel.

Tera-scale chips would apply a rule of semiconductor design, that smaller, slower cores tend to use less power, to meet businesses' performance needs while addressing concerns such as server power consumption.

"There's this advantage to simplifying the individual [processor] core, accepting the reduction in single-thread performance, while positioning yourself, because of the power reduction, to put more cores on the die," said Intel Chief Technology Officer Justin Rattner in Hillsboro, Ore. "That's the energy-efficiency proposition of Tera-scale. Less is more, actually, in the case of a Tera-scale machine because the underlying core efficiency is better than the cores we've been introducing this year."

Tera-scale chips would be particularly good for jobs requiring the processing of large amounts of data, such as computer visualization or using gestures to control a computer. But extracting the true performance potential of such a new approach won't be possible without improving chip technologies, including boosting on-board memory caches, creating high-speed interconnects for distributing data and building more efficient clock timing systems. Nor will it succeed without winning over software developers, many of whom are only now starting to tackle the move from single-threaded to multithreaded applications, Intel executives said.

"Every time you increase the number of threads, you're putting greater burden on the programmers to write the applications ... to actually harness all that available parallelism," Rattner said.

The Tera-scale approach is a radical change from Intel's Xeon 5100, which uses two complex processor cores. But one of the driving forces behind the Tera-scale research is the fact that chip transistor counts, already in the billions, will continue to double over time. Intel chips will approach 32 billion transistors by the end of the decade, researchers said.

Larger numbers of smaller transistors offer Intel the choice of packing numerous cores on a chip without increasing its area and affecting manufacturing yields.

Thus far, Intel and others have used the extra transistors to create more complex chips with larger on-board memory caches. Shifting toward many simple cores (trading two Woodcrest cores for tens of 386-style cores) would greatly increase a chip's parallel processing abilities and thus offer more performance, according to analysts.

"The bigger question is, How do you take advantage of such a system?" said Dean McCarron, principal analyst with Mercury Research, in Cave Creek, Ariz. "Not everything lends itself to [many threads]. But, that said, everybody seems to be in agreement that this is the path we're pretty much forced to go down."

Programming for Tera-scale chips will require a completely different approach that uses many different threads simultaneously. It's something only a few programmers currently are familiar with, said Steve Pawlowski, CTO for Intel's Digital Enterprise Group, in Hillsboro.
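McCarron's caveat that not everything lends itself to many threads can be made concrete with Amdahl's law, a standard idealized model that the article itself does not cite. The sketch below is illustrative only: the function name `amdahl_speedup` and the sample fractions are assumptions, and the model ignores real-world costs such as synchronization and memory contention.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Idealized speedup under Amdahl's law.

    parallel_fraction: share of the work (0.0 to 1.0) that can run in parallel.
    cores: number of cores the parallel share is spread across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial share runs at full cost; the parallel share is divided by cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload that is 90 percent parallelizable.
    for n in (2, 10, 100):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this model, a job that is 90 percent parallelizable tops out at roughly a 9.2x speedup on 100 cores, which is one way to see why the serial portion of an application matters more as core counts climb.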

So Intel is getting to work. In some cases, the company is working directly with large software makers. Elsewhere, its Software Products Group is offering tools to assist programmers with multithreading, said James Reinders, director of marketing and business development for the company's Developer Products Division, also in Hillsboro.

The tools, including compilers, performance libraries, tuners and thread checkers, aim to address such challenges as scalability, or how to make an application run faster on more than one core; correctness, or eliminating bugs; and ease of development.

"We're definitely seeing movement in attitudes of developers" toward multithreaded applications, Reinders said. "Over the next five years, I think we'll see most developers take an interest in understanding parallelism more."

At least one company, MainConcept, a maker of video codecs, has already adopted multicore, said CEO Markus Monig in Aachen, Germany.

MainConcept found that optimizing for dual-core chips using Intel's tools gave it a performance edge. Codecs run about "1.8 times faster on dual-core machines because you can actually cut the picture into slices and feed them to the separate processors," Monig said. "For us, the shift to multicore development has been pretty dramatic."
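The slicing idea Monig describes can be sketched in a few lines. This is not MainConcept's code or Intel's tooling: the frame is modeled as a plain list of pixel rows, and `split_into_slices`, `process_slice` and `process_frame` are hypothetical names, with a simple pixel inversion standing in for real codec work. A thread pool is used to keep the example portable; for CPU-bound work in CPython, a process pool would likely be needed to see real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(frame, n_slices):
    """Split a frame (list of pixel rows) into n_slices contiguous bands."""
    base, extra = divmod(len(frame), n_slices)
    slices, start = [], 0
    for i in range(n_slices):
        # The first `extra` slices take one additional row each.
        end = start + base + (1 if i < extra else 0)
        slices.append(frame[start:end])
        start = end
    return slices


def process_slice(rows):
    """Hypothetical stand-in for per-slice codec work: invert 8-bit pixels."""
    return [[255 - pixel for pixel in row] for row in rows]


def process_frame(frame, n_workers=2):
    """Process each slice on its own worker, then stitch the rows back in order."""
    slices = split_into_slices(frame, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input slices.
        results = list(pool.map(process_slice, slices))
    return [row for part in results for row in part]


if __name__ == "__main__":
    frame = [[i * 10 + j for j in range(4)] for i in range(5)]
    assert process_frame(frame, n_workers=2) == process_slice(frame)
```

The approach works here because each band of rows can be processed without looking at its neighbors; a real codec would also have to handle dependencies across slice boundaries.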

He predicted others will follow suit. "Companies like us [that] are driven to be competitive ... will have to," Monig said. "If your codec isn't fast, nobody will buy it. There can be lots of benefit [for] applications which take a lot of CPU power."

Yet despite extensive backing inside Intel's Corporate Technology Group (about 80 projects and 40 percent of the group's researchers are involved in Tera-scale research in some manner) and the existence of several niche markets that could take advantage of such technology today, Tera-scale may never fully see the light of day. After all, the research must be adopted by Intel's product groups before it can come to market. Those groups might decide to keep going with current plans or use some other approach to many-core architectures.

However, "it'll definitely affect the way products look later in the decade and early into the next," Rattner said.