Nvidia Unveils GPU Accelerators for Hyperscale Data Centers
Included in the Tesla Hyperscale Accelerator line is the Tesla M40 GPU, which is optimized for machine learning and cuts training time by roughly eight times compared with CPU-only systems. A typical AlexNet training run takes 10 days on a CPU-only system, but 1.2 days on an accelerated one, Buck said. The card offers scale-out performance through its support for Nvidia's GPUDirect technology, which speeds up multi-node neural network training. The Tesla M40 packs 3,072 cores, 12GB of GDDR5 memory and 288GB/s of memory bandwidth into a 250-watt power envelope, with a peak performance of 7 teraflops.

The Tesla M4 is a low-power GPU built for hyperscale environments that runs trained models in the data center and is optimized for Web service applications such as video transcoding, image and video processing, and machine learning inference. It can transcode, enhance and analyze up to five times more simultaneous video streams than CPUs, consumes 50 to 75 watts of power, and offers up to 10 times the power efficiency of a CPU for video processing and machine learning algorithms, according to Nvidia officials. Its small form factor fits the enclosure designs of hyperscale data center systems. The Tesla M4 holds 1,024 cores, 4GB of GDDR5 memory and 88GB/s of memory bandwidth, with a peak performance of 2.2 teraflops.

The Nvidia Hyperscale Suite of software tools includes cuDNN, a widely used library of primitives for the deep neural networks behind artificial intelligence applications; GPU-accelerated FFmpeg multimedia software to speed up video transcoding and processing; the GPU REST Engine, for easily creating and deploying high-throughput, low-latency Web services; and an Image Compute Engine service with REST APIs that resizes images five times faster than a CPU.
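The performance claims above can be sanity-checked with some back-of-the-envelope arithmetic using only the figures quoted in this article; the sketch below assumes the M4's upper 75-watt bound and treats peak teraflops as the efficiency yardstick:

```python
# Rough arithmetic from the figures quoted above -- illustrative only.

# AlexNet training: 10 days CPU-only vs. 1.2 days GPU-accelerated.
cpu_days = 10.0
gpu_days = 1.2
speedup = cpu_days / gpu_days  # roughly the "eight times" Nvidia cites
print(f"Training speedup: {speedup:.1f}x")

# Peak compute per watt for each card (TFLOPS -> GFLOPS, divided by watts).
m40_gflops_per_watt = 7.0 * 1000 / 250  # Tesla M40: 7 teraflops at 250 W
m4_gflops_per_watt = 2.2 * 1000 / 75    # Tesla M4: 2.2 teraflops at 75 W (upper bound)
print(f"M40: {m40_gflops_per_watt:.0f} GFLOPS/W, M4: {m4_gflops_per_watt:.0f} GFLOPS/W")
```

The roughly 8.3x training figure lines up with the "eight times" reduction Nvidia cites, and the per-watt numbers illustrate why the M4 is positioned as the low-power, hyperscale-friendly part.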
The Tesla M40 and the software suite will be available later this year, with the Tesla M4 coming in the first quarter of 2016.