For five years, Nvidia officials have been promoting the use of the company's graphics technology to speed up systems in high-performance computing environments.
At the 2013 International Supercomputing Conference (ISC) in Germany on June 18, Nvidia outlined efforts that officials say not only validate what the company has done so far, but also will help drive further adoption of GPU accelerators in supercomputers.
Nvidia announced that version 5.5 of CUDA, its parallel computing platform, will support the ARM chip architecture. ARM officials, whose low-power chip designs can be found in the bulk of smartphones and tablets, are turning their focus to the data center.
ARM and its partners believe that the performance and power efficiency of their systems-on-a-chip (SoCs) can address the demands around low power and density coming from Web hosting and cloud computing environments, as well as from some high-performance computing (HPC) organizations. They expect demand for ARM server SoCs to take off next year, when ARM releases its ARMv8 designs, which include such server features as 64-bit computing, more memory and greater virtualization support.
For Nvidia, CUDA support for ARM not only will make it easier for ARM processors to coexist with Nvidia's GPU accelerators in HPC systems and for programmers to port applications from ARM to the GPUs, but it also means that HPC systems could be powered entirely by Nvidia technology, since the company licenses ARM's SoC designs for its Tegra chips.
“We are seeing a lot of excitement around ARM, and [those organizations that are interested] are all waiting for ARM 64,” Roy Kim, marketing manager for Nvidia’s Tesla Group, told eWEEK.
Right now, the bulk of GPU accelerators are paired with x86 processors from Intel and Advanced Micro Devices. However, a number of HPC applications, including Amber, Gromacs and Hoomd-Blue, already have been ported to the combined ARM and GPU architecture, according to Kim. Organizations that tested the CUDA on ARM offering found that they needed few, if any, code changes to run the software on systems with ARM chips and Nvidia GPUs.
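The claim that such ports need few code changes follows from how CUDA programs are structured: the device kernels target the GPU, while the host side is ordinary C/C++ that the CUDA toolchain compiles for whatever CPU it targets, whether x86 or, with CUDA 5.5, ARM. The following minimal SAXPY sketch (a generic illustration, not code from any of the applications named above) would compile unchanged on either host architecture:

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Device kernel: y[i] = a * x[i] + y[i], one element per thread.
   This code runs on the GPU regardless of the host CPU's architecture. */
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    /* Host-side setup: plain C, compiled for x86 or ARM as appropriate. */
    float *h_x = (float *)malloc(bytes);
    float *h_y = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_x[i] = 1.0f; h_y[i] = 2.0f; }

    float *d_x, *d_y;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, h_y, bytes, cudaMemcpyHostToDevice);

    /* Launch with 256 threads per block, enough blocks to cover n. */
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);

    cudaMemcpy(h_y, d_y, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", h_y[0]);  /* 2.0 * 1.0 + 2.0 = 4.0 */

    cudaFree(d_x); cudaFree(d_y);
    free(h_x); free(h_y);
    return 0;
}
```

Because the CPU-specific work is confined to the compiler back end, moving such a program from an x86 host to an ARM host is largely a matter of rebuilding with the ARM-targeted toolchain, which is consistent with testers reporting few, if any, source changes.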
In addition, chip makers such as Calxeda, Applied Micro and Marvell Technologies are already making ARM-based SoCs for servers. AMD officials also on June 18 unveiled details of the company's upcoming ARM-based "Seattle" SoCs for servers.
However, Intel isn't backing down. The chip maker, which in November 2012 rolled out its Xeon Phi coprocessor technology, expanded the current portfolio of Xeon Phi offerings and talked at ISC about its next-generation "Knights Landing" Xeon Phi product, due out next year with greater performance and improved power consumption. In addition, organizations will be able to use the 14-nanometer Knights Landing chips either as coprocessors, doing essentially the same job as GPU accelerators, or as primary CPUs.
Coprocessors and GPU accelerators are gaining momentum in HPC. According to the Top500 list of the world’s fastest supercomputers released June 17, 54 systems used one or the other, with 39 choosing Nvidia GPUs and three choosing AMD’s ATI Radeon graphics products. Eleven used Intel’s Xeon Phi, including China’s Tianhe-2, the world’s fastest system.
IDC analysts, in a study released June 17, found that the number of HPC sites using coprocessors and accelerators doubled over the past two years, with Nvidia GPUs and Xeon Phi coprocessors in close competition.
In addition to the CUDA 5.5 announcement, Nvidia officials also noted that GPU accelerators are being used to develop neural networks, computing environments that operate in a fashion similar to the human brain, including by adapting their work to what they learn on the job. Google created a neural network that used 16,000 CPUs in 1,000 servers to create 1.7 billion parameters, connections similar to those between neurons in the brain. By contrast, Nvidia and researchers at Stanford University's Artificial Intelligence Lab created a network just as large with only three servers using Nvidia GPU accelerators.
Using 16 servers with GPU accelerators, they created an 11.2 billion-parameter neural network, 6.5 times larger than Google's.
Nvidia also listed other artificial intelligence labs that use its GPU accelerators. Nuance, for the past four years, has used GPU-accelerated neural networks to help its speech-recognition technology handle issues such as accents and background noise, Kim said.
Big data is another area where systems with GPU accelerators are gaining interest because of their performance and energy efficiency, he said.
The GPU accelerator “business is going into areas it hasn’t gone into before, and going into markets it’s not familiar with, and that’s because of demand,” he said.