IBM engineers briefed attendees of the In-Stat Fall Microprocessor Forum here Wednesday on the technical details of the chip design and on how to optimize its performance for multiple applications.
Formally introduced in February at the International Solid-State Circuits Conference in San Francisco, the 64-bit Cell processor will be used first in Sony's PlayStation 3 game console.
However, Chelmsford, Mass.-based Mercury Computer Systems Inc. has announced it will build Cell-based computers for use in the medical and aerospace industries.
IBM and Sony Computer Entertainment talked in May of last year about co-developing workstations built around the Cell; they said in November that this first Cell-based workstation had "powered on."
The main design goal for the Cell is to produce up to a tenfold increase in performance for most applications, said David Krolak, the lead IBM engineer on the Cell project.
The Cell is composed of a 64-bit PowerPC processor core called the PPE (Power Processing Element), running at 3GHz to 4GHz and surrounded by eight special-purpose cores called SPEs (Synergistic Processing Elements).
Krolak noted that the Cell could be configured for use in game consoles, blades, HDTV sets, home media servers and supercomputers.
To connect the separate elements of the Cell, IBM designed the Element Interconnect Bus.
The EIB, Krolak said, will be a coherent SMP bus supporting 64 outstanding results per requestor and addressing collision detection and prevention.
Krolak said that this, plus the EIB's independent command and data networks and the ability to split command and data transactions, will allow a 3.2GHz Cell processor to reach a peak bandwidth of 300G Bps, with a sustained rate of up to 200G Bps.
This will provide "next-generation bandwidth," Krolak said.
He noted that the front-side bus on most contemporary computers offers 6 to 8G Bps, with DDR2 memory bandwidth at 6 to 11G Bps.
In comparison, Krolak said, the Cell will use two Rambus I/O controllers and a Rambus Dual XDR memory controller for an aggregate memory data bandwidth of 25.6G Bps in each direction.
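To put the quoted figures side by side, the short Python sketch below works out the ratios between them. It uses only the numbers reported from the briefing and assumes, for illustration, that all of them are expressed in the same unit (gigabytes per second). The constant and function names are made up for this example.

```python
# Figures quoted in the briefing. Assumption: all are in gigabytes per second.
EIB_PEAK = 300.0           # peak EIB bandwidth at 3.2GHz
EIB_SUSTAINED = 200.0      # sustained EIB bandwidth
FSB_RANGE = (6.0, 8.0)     # typical front-side bus, low to high
DDR2_RANGE = (6.0, 11.0)   # DDR2 memory bandwidth, low to high
XDR_PER_DIRECTION = 25.6   # aggregate memory data bandwidth, each direction


def ratio_range(value, low_high):
    """Return (smallest, largest) ratio of value to a (low, high) range."""
    low, high = low_high
    return (value / high, value / low)


# How many times larger the EIB peak is than each contemporary figure.
fsb_ratio = ratio_range(EIB_PEAK, FSB_RANGE)
ddr2_ratio = ratio_range(EIB_PEAK, DDR2_RANGE)

# How the sustained on-chip rate compares with the memory interface.
memory_ratio = EIB_SUSTAINED / XDR_PER_DIRECTION

print(f"EIB peak vs. front-side bus: {fsb_ratio[0]:.1f}x to {fsb_ratio[1]:.1f}x")
print(f"EIB peak vs. DDR2: {ddr2_ratio[0]:.1f}x to {ddr2_ratio[1]:.1f}x")
print(f"EIB sustained vs. memory interface: {memory_ratio:.1f}x")
```

On these assumptions, the on-chip bus can sustain several times what the memory interface can deliver, so memory traffic is more likely than the bus to limit throughput.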
Krolak said that there were "potential bottlenecks" with this configuration, though. Developers will have to pay attention to the fact that the Cell's data rings are a shared resource.
Multiple transactions can be on the same ring, Krolak said. But a transfer between two units will block access to that ring for other units on that path. As a result, he said, programmers will have to manage which devices talk to each other when they set up workload assignments.
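The blocking behavior Krolak described can be illustrated with a toy model. The Python sketch below is not IBM's arbitration logic. It assumes a single one-directional ring with 12 stops (a number chosen for illustration, not taken from the briefing) and treats two transfers as conflicting when their paths share a link. The function names are hypothetical.

```python
RING_SIZE = 12  # assumed number of stops on the ring; not from the briefing


def path_segments(src, dst, ring_size=RING_SIZE):
    """Return the set of links a transfer from src to dst occupies.

    Link i is the hop from stop i to stop (i + 1) % ring_size.
    """
    segments = set()
    pos = src
    while pos != dst:
        segments.add(pos)
        pos = (pos + 1) % ring_size
    return segments


def conflicts(a, b, ring_size=RING_SIZE):
    """True if transfers a and b, each a (src, dst) pair, share a link."""
    return bool(path_segments(*a, ring_size) & path_segments(*b, ring_size))


def schedule(transfers, ring_size=RING_SIZE):
    """Greedily group transfers into rounds with no overlapping paths."""
    rounds = []
    for t in transfers:
        for r in rounds:
            if not any(conflicts(t, other, ring_size) for other in r):
                r.append(t)
                break
        else:
            # No existing round can take this transfer; start a new one.
            rounds.append([t])
    return rounds


# Transfers 0->3 and 4->6 use disjoint links, so they can share the ring;
# 2->5 overlaps both and has to wait for another round.
print(schedule([(0, 3), (2, 5), (4, 6)]))
```

In this model, placing units that communicate often next to each other shortens their paths and leaves more of the ring free for other traffic. That is the kind of workload assignment decision Krolak said programmers will have to manage.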
Memory and I/O workload assignments will also have to be carefully managed, Krolak said, for maximum performance.