Nvidia officials are adding to the company’s portfolio of graphics processors aimed at emerging markets like deep learning, artificial intelligence and computer vision with a new version of its powerful Tesla P100 GPU.
At Nvidia’s GPU Technology Conference in April, CEO Jen-Hsun Huang introduced the Tesla P100, a data center GPU built on the company’s Pascal architecture and a 16-nanometer FinFET manufacturing process. It is aimed at high-performance computing (HPC) environments running new workloads that require high levels of parallel processing. The first version, announced at that conference, was designed for Nvidia’s new NVLink interconnect technology.
At the ISC High Performance 2016 show this week in Frankfurt, Germany, Nvidia officials unveiled the P100 GPU accelerator for PCIe, an interconnect technology common on most servers. The new chip, which will be available in the fourth quarter, delivers 4.7 teraflops of double-precision performance and 9.3 teraflops of single-precision performance, according to the company. It also provides 18.7 teraflops of half-precision performance with Nvidia’s GPU Boost technology.
It will come in two versions—one with 16GB of High-Bandwidth Memory (HBM2) and 720GB/second of memory bandwidth, and the other with 12GB HBM2 and 540GB/second of memory bandwidth. Nvidia said system OEMs like Hewlett Packard Enterprise, Dell, Cray, IBM and SGI are working on systems that will incorporate the P100 for PCIe.
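For readers who want to compare the quoted figures, the short Python sketch below works out two simple ratios from the numbers in this article: peak throughput at each precision relative to double precision, and memory bandwidth per gigabyte of capacity for the two versions. The function and variable names are illustrative only and do not come from Nvidia; the inputs are the vendor-quoted peak figures, not measured results.

```python
# Peak throughput figures quoted in the article for the Tesla P100 for PCIe, in teraflops.
PEAK_TFLOPS = {"double": 4.7, "single": 9.3, "half": 18.7}

# Memory configurations quoted in the article: (capacity in GB, bandwidth in GB/second).
MEMORY_CONFIGS = {"16GB": (16, 720), "12GB": (12, 540)}


def precision_ratios(peaks):
    """Return each precision's peak throughput relative to double precision."""
    base = peaks["double"]
    return {name: round(value / base, 2) for name, value in peaks.items()}


def bandwidth_per_gb(configs):
    """Return memory bandwidth (GB/second) per GB of capacity for each version."""
    return {name: bandwidth / capacity for name, (capacity, bandwidth) in configs.items()}


if __name__ == "__main__":
    print(precision_ratios(PEAK_TFLOPS))
    print(bandwidth_per_gb(MEMORY_CONFIGS))
```

On the quoted numbers, each halving of precision roughly doubles peak throughput, and both memory versions offer the same bandwidth per gigabyte of capacity.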
The move to support PCIe is important to making supercomputing capabilities available to more scientists and researchers, according to company officials. Most systems include a PCIe slot, while NVLink, which is faster than PCIe, is less widely available. Nvidia estimates that two out of every three scientists don’t have access to the compute cycles they need on HPC systems to do their work.
GPUs are increasingly being used to accelerate workloads on HPC systems without significantly increasing power consumption. According to the latest Top500 list of the world’s fastest supercomputers, released June 20 at the ISC High Performance show, 93 of the systems on the list use accelerators of some kind, and 67 of those use Nvidia GPUs.
“Accelerated computing is the only path forward to keep up with researchers’ insatiable demand for HPC and AI (artificial intelligence) supercomputing,” Ian Buck, vice president of accelerated computing at Nvidia, said in a statement. “Deploying CPU-only systems to meet this demand would require large numbers of commodity compute nodes, leading to substantially increased costs without proportional performance gains.”
The Tesla P100 accelerators for PCIe enable the creation of what Nvidia officials call “super nodes,” each of which provides the throughput of more than 32 CPU-based nodes with up to 70 percent lower capital and operational costs. When running the Amber molecular dynamics code, a server powered by a single Tesla P100 delivers more performance than 50 CPU-only server nodes, they said.
Nvidia has spent more than five years developing technologies for emerging markets such as deep learning and AI, which can take advantage of the parallel processing capabilities of GPUs. The introduction of the Tesla P100 in April was a key step forward for the company. Also at the show in April, Nvidia announced the DGX-1, which officials called the world’s first supercomputer for deep learning and AI. It combines eight Tesla P100 GPUs with two Intel Xeon server chips to drive 170 teraflops of performance in a 3U (5.25-inch) form factor.
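As a rough sanity check on the DGX-1 figure, the Python sketch below divides the quoted 170 teraflops across the system’s eight GPUs and compares the result with the half-precision figure quoted for the PCIe card. It rests on two assumptions the article does not confirm: that the 170 teraflops refers to half-precision throughput, and that it counts only the GPUs, not the two Xeon chips. The names used are illustrative.

```python
# Figures quoted in the article.
DGX1_TOTAL_TFLOPS = 170.0          # DGX-1 system performance
DGX1_GPU_COUNT = 8                 # Tesla P100 GPUs in one DGX-1
PCIE_HALF_PRECISION_TFLOPS = 18.7  # Tesla P100 for PCIe, half precision


def per_gpu_tflops(total_tflops, gpu_count):
    """Split a system-level figure evenly across its GPUs.

    Assumption: the CPUs contribute nothing to the quoted total.
    """
    return total_tflops / gpu_count


per_gpu = per_gpu_tflops(DGX1_TOTAL_TFLOPS, DGX1_GPU_COUNT)
ratio_to_pcie = per_gpu / PCIE_HALF_PRECISION_TFLOPS

if __name__ == "__main__":
    print(f"Per-GPU share of the DGX-1 figure: {per_gpu} teraflops")
    print(f"Ratio to the PCIe card's half-precision figure: {ratio_to_pcie:.2f}")
```

Under those assumptions, each GPU in the DGX-1 accounts for somewhat more than the PCIe card’s quoted half-precision peak, which is consistent with the DGX-1 using the NVLink version of the P100.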
“AI is becoming the next big thing in supercomputing,” Marc Hamilton, vice president of solutions architecture and engineering at Nvidia, told eWEEK.
Nvidia Brings Tesla P100 GPU Acceleration to PCIe Servers
The push to expand GPU computing in HPC and the enterprise is working. Nvidia in May announced that its data center business in the first quarter grew 63 percent over the same period in 2015, to $143 million, due in large part to the increasing demand in HPC for deep learning, in which computers can be trained to learn based on experience, much like humans do.
“One of the most important areas of high performance computing has been this area called deep learning,” Huang said during a conference call in May about the financial numbers, according to a transcript on Seeking Alpha. “Deep learning is a very important field of machine learning, and machine learning is now in the process of revolutionizing artificial intelligence, making machines more and more intelligent and using it to discover insight that, quite frankly, isn’t possible otherwise.”
Also at ISC 16, Nvidia officials introduced upgrades to the vendor’s deep learning software. The company offers DIGITS (Deep Learning GPU Training System) to help users design, train and validate deep neural networks. With DIGITS 4, a new object detection workflow enables scientists to train these networks to find objects such as faces, pedestrians, traffic signs and vehicles among many other images. This is important for everything from tracking objects from satellites to driver assistance systems. The DIGITS 4 release candidate will be available this week from the Nvidia developer program.
Version 5.1 of Nvidia’s cuDNN, also available immediately, delivers accelerated training of deep neural networks, while the GPU Inference Engine (GIE) optimizes trained deep neural networks for efficient runtime performance.