More than a decade ago, Nvidia executives championed general-purpose GPUs as accelerators for high-performance computing workloads. The GPUs would complement CPUs, enabling servers to accelerate application performance while keeping a lid on power consumption. Now, at a time when modern workloads are demanding even more compute power, the company is continuing to position GPUs as the primary computing engine.
At Nvidia’s GTC show in Taiwan, CEO Jensen Huang unveiled the HGX-2, a server platform designed to address the needs of both high-performance computing (HPC) and artificial intelligence (AI) workloads. The platform includes 16 of Nvidia’s Tesla V100 Tensor Core GPUs that are linked via the company’s NVSwitch interconnect fabric, which essentially enables the 16 graphics engines to work as a single GPU. The first system to leverage the HGX-2 platform is Nvidia’s upcoming DGX-2, which delivers up to 2 petaflops of performance and 512GB of HBM2 high-bandwidth memory, according to company officials.
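Those headline figures are easy to sanity-check against the published per-GPU specifications: each Tesla V100 peaks at 125 teraflops of mixed-precision Tensor Core throughput, and the 32GB V100 variant carries 32GB of HBM2. A quick back-of-the-envelope calculation:

```python
# Sanity check of the DGX-2 headline figures, assuming the published
# per-GPU peaks: 125 TFLOPS Tensor Core (mixed precision) and 32 GB
# of HBM2 for the 32GB Tesla V100 variant.
NUM_GPUS = 16
TFLOPS_PER_V100 = 125      # peak Tensor Core throughput per GPU
HBM2_PER_V100_GB = 32      # 32GB V100 variant

total_pflops = NUM_GPUS * TFLOPS_PER_V100 / 1000
total_hbm2_gb = NUM_GPUS * HBM2_PER_V100_GB

print(total_pflops)    # 2.0 (petaflops)
print(total_hbm2_gb)   # 512 (GB of HBM2)
```

The arithmetic lines up with the company’s quoted 2 petaflops and 512GB totals.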
The HGX-2 platform comes at a time of change in data centers, driven in large part by modern workloads that increasingly rely on AI and machine learning, as well as by cloud computing, analytics and mobility. Intel is the dominant processor vendor with more than 90 percent of the market, but a growing demand for an alternative is driving not only Nvidia’s efforts but those of Advanced Micro Devices and such Arm-based manufacturers as Cavium, which earlier this month introduced its ThunderX2 server chip.
According to Nvidia officials, the HGX-2 enables high-precision calculations using FP64 and FP32 floating point formats for HPC computing and simulations. At the same time, it also enables FP16 and Int8, aimed at AI training and inference. With such capabilities, the same platform can be used for HPC workloads as well as AI needs.
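The precision trade-off described above can be illustrated with a small sketch using only Python’s standard library, which can round a value through the FP32 and FP16 formats via `struct` (the variable names here are illustrative, not from Nvidia’s documentation):

```python
import struct

def round_trip(fmt: str, x: float) -> float:
    # Pack a Python float (FP64) into the given format and unpack it,
    # showing what survives the narrower representation.
    # 'f' = FP32 (single precision), 'e' = FP16 (half precision).
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

x = 1.0 + 1e-9
print(x - 1.0)                   # FP64 keeps the tiny increment (~1e-9)
print(round_trip('f', x) - 1.0)  # FP32 rounds it away -> 0.0
print(round_trip('e', 1.001))    # FP16 is coarser still (~3 decimal digits)
# Int8, by contrast, covers only integers from -128 to 127, which is why
# it is reserved for quantized inference rather than simulation math.
```

This is why the same silicon can serve both markets: high-precision formats preserve the small differences that HPC simulations depend on, while the narrower formats trade that precision for throughput in AI training and inference.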
“As long as there is data, so long as there is knowledge in how to create the architecture, we can create absolute enormous software,” Huang said during his keynote address at the GTC show, according to Nvidia. “And every single company in the world that develops software will need an AI supercomputer.”
Nvidia officials describe the HGX-2 as a platform for cloud servers run by such hyperscalers as Amazon Web Services, Microsoft and Facebook. Such companies were quick to adopt the HGX-1 platform introduced by the GPU maker at the Computex show last year. HGX-1 offered up to eight GPUs, half of what the HGX-2 can support. Using NVSwitch interconnect to tie together up to 16 Tesla V100 GPUs enables them to become what Huang called “the world’s largest GPU.”
“Every one of the GPUs can talk to every one of the GPUs simultaneously at a bandwidth of 300 GB/s, 10 times [the speed of] PCI Express,” the CEO said. “So everyone can talk to each other all at the same time.”
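Huang’s “10 times PCI Express” comparison roughly checks out if one assumes PCIe 3.0 x16, which moves about 16 GB/s in each direction (roughly 32 GB/s bidirectional):

```python
# Rough check of the "10x PCI Express" claim, assuming PCIe 3.0 x16
# at ~16 GB/s per direction (~32 GB/s bidirectional).
NVLINK_PER_GPU_GBPS = 300      # NVSwitch fabric bandwidth per GPU
PCIE3_X16_BIDIR_GBPS = 32

ratio = NVLINK_PER_GPU_GBPS / PCIE3_X16_BIDIR_GBPS
print(ratio)   # ~9.4, i.e. roughly the 10x Huang cited
```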
Nvidia officials also are positioning the HGX-2 as a building block that system makers can use to create servers designed for specific tasks. Four server makers—Lenovo, Supermicro, QCT and Wiwynn—said they plan to launch systems this year that include the Nvidia platform. Four original design manufacturers (ODMs)—Foxconn, Quanta, Wistron and Inventec—also are designing systems based on HGX-2 that will be used in large cloud data centers.
HGX-2 also is part of Nvidia’s lineup of GPU-accelerated server platforms that offer a mix of GPUs with Intel Xeon server CPUs and interconnects to address a range of HPC, AI and accelerated computing workloads. According to Nvidia, HGX-I2 systems are for AI inference workloads, HGX-T2 for training, and SCX for supercomputing. Huang has made AI and machine learning workloads a key focus for Nvidia, noting that the parallel computing capabilities of GPUs fit well with the needs of AI training applications and can also handle AI inference.
However, other chip makers are taking a hard run at AI as well. Intel two years ago bought Nervana Systems to bolster its AI capabilities and is working on what the company is calling a Neural Network Processor. In addition, Intel and Xilinx are pushing their respective field-programmable gate arrays (FPGAs) for AI workloads, AMD is promoting its x86-based Epyc chips, and Arm-based chip makers like Qualcomm are putting a focus on AI workloads.
Use of accelerators has become commonplace in both HPC and enterprise computing. According to the Top500 list of the world’s fastest supercomputers released in November 2017, 102 of the systems used GPU accelerators from Nvidia or AMD or coprocessors from Intel.