AMD Developing 32-Core Zen APU for Supercomputers

The chip maker unveiled some details of its upcoming Exascale Heterogeneous Processor in a paper submitted to the IEEE.

Advanced Micro Devices reportedly is developing a high-end chip aimed at supercomputers that will include as many as 32 "Zen" processing cores and an unknown number of its upcoming "Greenland" GPUs.

Some details of AMD's "Exascale Heterogeneous Processor" (EHP) were outlined in a paper submitted to the IEEE (Institute of Electrical and Electronics Engineers) and first published on the Bits and Chips news site. According to the site, the new accelerated processing unit (APU) could hit the market between 2016 and 2017.

The new EHP would fit in with AMD's larger plans to make an aggressive push back into the data center. At a meeting with financial analysts and journalists in May in New York City, President and CEO Lisa Su and other executives said the recharged effort in the data center would include not only servers, but also workstations, networking gear and storage products. Su said the data center "is probably the biggest single bet we're making today."

The executives said demand for more choice in the x86 server market is growing, driven by the need for more competition and for better economics and innovation in the space. AMD wants to provide that alternative to Intel, according to Forrest Norrod, senior vice president and general manager of the company's Enterprise, Embedded and Semi-Custom Unit.

Not surprisingly, AMD is taking a heterogeneous approach to the idea of a chip for exascale computing. Since buying graphics maker ATI in 2006, the vendor has been at the forefront of integrating CPUs and GPUs on the same piece of silicon to create its APUs. Along with Nvidia, it also has been aggressive in pushing GPU accelerators to help organizations in the high-performance computing (HPC) space speed up their supercomputers while holding down power consumption. GPUs can run the highly parallel workloads that are becoming more commonplace in HPC more efficiently than CPUs can.

AMD also has been the driver behind the Heterogeneous System Architecture (HSA), which is aimed at creating processor designs that enable the CPUs and GPUs to work together more seamlessly. Software developers don't have to program to one or the other; the system will decide which will run the workloads.

In the latest Top500 list of the world's fastest supercomputers, released in July, 90 of the systems use accelerators, either GPUs from AMD or Nvidia or Intel's x86 Xeon Phi co-processors. That is up from 75 in the list released in November 2014. Four of the top 10 systems use accelerators.
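To put those counts in proportion, the short sketch below converts them into shares of the list. It assumes only that the Top500 list ranks 500 systems; the variable names are illustrative and the counts are the ones cited above.

```python
# Counts as cited in the article; names are illustrative.
LIST_SIZE = 500  # the Top500 list ranks 500 systems

accelerated_now = 90   # latest (July) list
accelerated_prev = 75  # November 2014 list

# Share of the list using accelerators, in percent.
share_now = 100 * accelerated_now / LIST_SIZE
share_prev = 100 * accelerated_prev / LIST_SIZE

# Relative growth in the number of accelerated systems, in percent.
growth = 100 * (accelerated_now - accelerated_prev) / accelerated_prev

print(f"{share_prev:.0f}% -> {share_now:.0f}% of the list, "
      f"{growth:.0f}% more accelerated systems")
```

By this arithmetic, accelerated systems still make up less than a fifth of the list.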

AMD's exascale chip will leverage new products and technologies outlined by company executives in the meeting with financial analysts in May. They will include the Zen CPU core design, an architecture two years in the making that will bring simultaneous multi-threading (SMT), a technology similar to Intel's Hyper-Threading, for improved performance, along with support for DDR4 memory and a FinFET transistor design. Zen will offer significant improvements in performance and power efficiency over current APUs, officials said. Zen will appear in an array of chips for everything from high-end PCs to servers.

The EHP also will use AMD's High-Bandwidth Memory (HBM) technology (up to 32GB of HBM2) and an interconnect, AMD's Coherent Fabric, to facilitate communication between the CPUs and GPUs. An interposer will connect the memory stacks to the CPUs and GPUs. Each EHP reportedly will deliver 10 teraflops of throughput, with 100,000 of the chips linked together to offer exascale computing.
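The exascale figure follows directly from the reported numbers. The sketch below is a back-of-the-envelope check using peak rates only; it assumes perfect scaling across chips and ignores interconnect and software overheads, which a real system would not escape. The variable names are illustrative.

```python
# Back-of-the-envelope check using the reported peak figures.
# Assumes perfect scaling; real systems lose throughput to
# communication and software overheads.
TERAFLOP = 10**12  # floating-point operations per second
EXAFLOP = 10**18

per_chip_flops = 10 * TERAFLOP  # reported throughput of one EHP
chip_count = 100_000            # reported number of linked EHPs

aggregate_flops = per_chip_flops * chip_count
print(aggregate_flops // EXAFLOP, "exaflop peak")
```

Ten teraflops times 100,000 chips comes to 10^18 operations per second, which is the one-exaflop threshold.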

The Obama administration last week issued an executive order creating the National Strategic Computing Initiative (NSCI), which includes accelerating the development of an exascale system in the United States as one of its key goals. The government is looking for a system that can deliver 100 times the performance of current 10-petaflop systems across a range of applications. Over the next 15 years, it also aims to create the technology needed for future HPC systems after current semiconductor technologies hit their limits and Moore's Law is exhausted.