Intel describes aspects of its Larrabee microarchitecture, including the design of an x86 processing core developed specifically for the chip. The chip maker explains why its engineers believe the Larrabee processor will usher in a new era of parallel software programming.
Intel is offering the first in-depth look at its "Larrabee" processor, which the
chip maker plans to offer to address a range of graphics and
visual applications using x86 processing cores instead of more traditional GPUs.
In a paper, "Larrabee: A Many-Core x86 Architecture for Visual
Computing," Intel engineers offered several new details about the
forthcoming Larrabee graphics processing unit, including the fact that Intel
derived the instruction pipeline for the individual x86 cores from the
company's Pentium chip.
In addition, Larrabee will support Microsoft's DirectX API as well as OpenGL,
which Intel hopes will motivate a legion of software developers to create new
visual- and graphics-intensive applications while taking advantage of the
traditional Intel Architecture found in Larrabee's x86 cores.
The first of the Larrabee chips, which are destined for the high-end PCs
that use discrete graphics cards, will not arrive until 2009 or 2010,
although Intel is expected to release samples starting in late 2008. Larrabee
is described as a "many-core" processor, which means that it's likely
to contain 10 or more individual x86 CPU cores within the silicon package. (Intel's
upcoming Nehalem processors
are likely to have up to eight cores.)
While Intel engineers have spoken about Larrabee and its place within
high-performance computing, the paper makes clear that the first of the
Larrabee processors are designed for the gaming market, where the chip will
compete against high-end GPU offerings from ATI, which is owned by Advanced
Micro Devices, and Nvidia. The fact that Intel is supporting the industry-standard DirectX and
OpenGL APIs shows that the chip maker is looking to encourage developers to
create new gaming applications on its architecture.
Intel is also betting that Larrabee will usher in a new era of parallel
computing by offering developers a way to create highly specialized
applications, such as games that require visual computing or scientific
software applications that require intensive graphics capabilities, using the
familiar x86 instruction set along with the C and C++ programming languages.
Nvidia, by contrast, with its Tesla 10 series GPGPU (general-purpose GPU), requires
developers to learn a new programming language called CUDA (Compute Unified
Device Architecture), which allows the GPU to be programmed like a CPU.
For its part, AMD
and its ATI graphics division are embracing OpenCL, an open programming
standard. AMD is also moving toward combining the CPU and GPU on the same
piece of silicon as part of its Accelerated Computing program.
In short, Intel is looking to combine the throughput capabilities of a CPU
with the parallel programming abilities found in graphics processors.
"What the graphics and general data parallel application market needs
is an architecture that provides the full programming abilities of a CPU, the
full capabilities of a CPU together with the parallelism that is inherent in
graphics processors," said Larry Seiler, a senior principal engineer with
Intel. "Larrabee provides [that] and it's a practical solution to the
limitations of current graphics processors."
This development could lead to a new way of looking at the capabilities of
CPUs and GPUs in the commercial market.
"What stands out is that Intel views the CPU as the best GPU,"
said John Spooner, an analyst with Technology Business Research.
"Intel is able to apply x86 to rendering graphics rather than adopting
a new or different architecture, which is clearly directly opposite of Nvidia's
view of the world," Spooner added. "These companies are sure to engage in a public jousting match over whose architecture
is better. The one that comes out on top, though, will be determined by
performance and how well accepted the architecture is by developers."
At the heart of Larrabee is a series of simple x86 cores that are built with
short instruction pipelines derived from the Pentium chip. The chip will also
include what Intel describes as vector processing units, which enhance the
performance of graphics and video applications.
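To illustrate why a vector processing unit helps graphics work, here is a minimal C++ sketch of the same brightness adjustment applied to a run of pixels in fixed-width groups. The lane count (kLanes), the function name and the gain/bias operation are assumptions made for illustration; the article does not state how wide Larrabee's vector units are or how they are programmed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed lane count, for illustration only; not taken from the article.
constexpr std::size_t kLanes = 16;

// Applies pixel = pixel * gain + bias to every element, walking the data in
// groups of kLanes. Each group stands in for one vector-wide operation.
// Returns the number of groups processed.
std::size_t brighten(std::vector<float>& pixels, float gain, float bias) {
    std::size_t steps = 0;
    for (std::size_t base = 0; base < pixels.size(); base += kLanes) {
        const std::size_t end = std::min(base + kLanes, pixels.size());
        // On real vector hardware this inner loop would be a single
        // instruction covering all lanes at once.
        for (std::size_t i = base; i < end; ++i) {
            pixels[i] = pixels[i] * gain + bias;
        }
        ++steps;
    }
    return steps;
}
```

Under these assumptions, 40 pixels need three vector-wide steps rather than 40 scalar ones, which is the kind of saving the vector units are meant to deliver.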
Each Larrabee core will support four execution threads, and each thread
will have its own register set, which helps hide memory latency. In this setup,
Larrabee offers a simple, efficient in-order instruction pipeline but
maintains some of the benefits of an out-of-order pipeline, which helps when
running applications designed to run in parallel. The short pipelines on
Larrabee will allow for faster access to the Level 1 cache in each core.
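One way to picture the benefit of four threads per core is a toy simulation, sketched here in C++. It assumes every instruction is followed by a fixed memory wait and that switching threads costs nothing because each thread keeps its own registers. The function name, the latency figure and the issue policy are all hypothetical and are not drawn from Intel's paper.

```cpp
#include <array>
#include <cassert>

// Toy model of one in-order core. Each hardware thread issues one
// instruction, then waits `latency` cycles for memory before it can issue
// again. Returns how many cycles the core spent issuing nothing.
int idle_cycles(int thread_count, int latency, int cycles) {
    assert(thread_count >= 1 && thread_count <= 4);  // four threads per core
    std::array<int, 4> wait{};  // cycles each thread must still wait
    int idle = 0;
    for (int c = 0; c < cycles; ++c) {
        bool issued = false;
        for (int t = 0; t < thread_count; ++t) {
            if (wait[t] > 0) {
                --wait[t];            // still waiting on memory
            } else if (!issued) {
                wait[t] = latency;    // issue, then start a new memory wait
                issued = true;        // only one instruction per cycle
            }
        }
        if (!issued) ++idle;
    }
    return idle;
}
```

With a made-up latency of three cycles, a single thread leaves the core idle for six of eight cycles, while four threads keep it busy on every cycle. That is the in-order analogue of the latency hiding an out-of-order pipeline provides.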
All the Larrabee x86 cores (at this point Intel has given no guidance as to how many cores
Larrabee will use) will share a large L2 cache, which will be partitioned among the
different cores and allow for high bandwidth and data sharing.
The entire Larrabee chip architecture will be built on what Intel called a
"bidirectional ring network," which should also allow faster
communication among the individual x86 cores.
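The advantage of a ring that carries traffic in both directions can be shown with a small hop-count calculation, sketched here in C++. The model is an assumption: cores are treated as numbered stops on a ring, with one hop between neighbors. The function names and the core count used in the example are hypothetical, since Intel has not said how many cores the chip will have.

```cpp
#include <algorithm>
#include <cassert>

// Hops from stop `from` to stop `to` on a ring of `cores` stops when
// messages may travel in one direction only (toward higher indices).
int hops_unidirectional(int from, int to, int cores) {
    assert(cores > 0 && from >= 0 && from < cores && to >= 0 && to < cores);
    return ((to - from) % cores + cores) % cores;
}

// Hops when messages may travel in either direction: take the shorter way.
int hops_bidirectional(int from, int to, int cores) {
    const int forward = hops_unidirectional(from, to, cores);
    return std::min(forward, cores - forward);
}
```

On a hypothetical eight-stop ring, a message from stop 0 to stop 7 takes seven hops in one direction but a single hop when it can go the other way, and no trip is ever longer than half the ring.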
Intel will present the entire technical paper at the
SIGGRAPH conference in Los