There’s little debate that graphics processing unit (GPU) manufacturer NVIDIA is the de facto standard when it comes to providing silicon to power machine learning (ML) and artificial intelligence (AI) based systems. NVIDIA is to accelerated computing what Intel was to general-purpose computing. Its GPUs can be found in everything from large data center systems to automobiles to desktop video devices, and even consumer endpoints.
NVIDIA is best known for GPUs but also makes systems
An emerging part of NVIDIA’s business is the systems group, where it makes fully functioning, turnkey servers and desktop PCs for accelerated computing. An example of this is the NVIDIA DGX Server line, a set of engineered systems specifically built for the rigors of AI/ML. This week at the digital Supercomputing show, NVIDIA announced the latest member of its DGX family, the DGX A100 Station.
This “workstation” is a beast of a computer and features four of the recently announced A100 GPUs. These GPUs were designed for data centers and come with either 40 GB or 80 GB of GPU memory each, giving the workstation up to 320 GB of GPU memory for data scientists to use for inference, training and analysis. DGX A100 Station has a whopping 2.5 petaflops of AI performance and features NVIDIA’s NVLink as the high-performance backbone that connects the GPUs with minimal inter-chip latency, effectively creating one massive GPU.
MIG enables workgroups to leverage a single system
I put the term “workstation” in quotes because it’s really a workstation in form factor only; even at 2.5 petaflops, compared to the 5 petaflops that the A100 Server has, it's still a beast of a machine. The benefit of the DGX Station is that it brings AI/ML out of the data center and allows workgroups to plug it in and run it anywhere. The workstation is the only workgroup server I’m aware of that supports NVIDIA’s Multi-Instance GPU (MIG) technology. With MIG, the A100 GPUs can be partitioned so a single workstation can provide 28 GPU instances to run parallel jobs and support multiple users, without impacting system performance.
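For readers who want to check the math behind the 320 GB and 28-instance figures, here is a minimal sketch in Python. It assumes a limit of seven MIG instances per A100, which comes from NVIDIA's MIG documentation rather than from the announcement itself, and the `StationConfig` class is a hypothetical helper invented purely for illustration.

```python
from dataclasses import dataclass

# Assumption: each A100 can be split into at most 7 MIG instances.
# The announcement only gives the 28-instance total for the system.
MAX_MIG_INSTANCES_PER_A100 = 7


@dataclass(frozen=True)
class StationConfig:
    """Hypothetical description of one four-GPU workstation configuration."""

    gpu_count: int
    memory_per_gpu_gb: int

    def total_gpu_memory_gb(self) -> int:
        # Aggregate GPU memory across all GPUs in the system.
        return self.gpu_count * self.memory_per_gpu_gb

    def max_mig_instances(self) -> int:
        # Upper bound on isolated GPU instances the system can expose.
        return self.gpu_count * MAX_MIG_INSTANCES_PER_A100


if __name__ == "__main__":
    for memory_gb in (40, 80):
        config = StationConfig(gpu_count=4, memory_per_gpu_gb=memory_gb)
        print(
            f"4 x {memory_gb} GB A100: "
            f"{config.total_gpu_memory_gb()} GB of GPU memory, "
            f"up to {config.max_mig_instances()} MIG instances"
        )
```

The instance count does not depend on which memory option is chosen; only the aggregate memory changes between the 40 GB and 80 GB configurations.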
As mentioned previously, the workstation form factor makes the A100 Station ideal for workgroups, and it can be procured directly by the lines of business. Contrast this with the A100 Server, which is deployed into a data center and typically purchased and managed by the IT organization. Most line-of-business individuals, such as data scientists, don’t have the technical acumen or even the data center access needed to purchase a server, rack and stack it, connect it to the network and do the IT things that need to be done to keep it running.
A100 Station is designed for simplicity
The A100 Station looks like a big computer. It sits upright on or under a desk and simply requires the user to plug the power cord and network in. The simple design makes it perfect for agile data science teams who work in a lab, a traditional office or even at home. DGX Station was designed for simplicity and does not require any IT support or advanced technical skills. My first job out of college was working with a group of data scientists as an IT person, and I can attest to the importance of simplicity with that audience.
Without something like the A100 Station, which was purpose-built for accelerated computing, workgroups would be forced to purchase CPU-based desktop servers that are severely underpowered for this kind of use case. Sure, the average Intel-based workgroup server can run Word and Google Docs, but it can take months to run AI-based analytic models. With the GPU-powered systems, what took months can typically be done in just a few hours or even minutes.
Although NVIDIA didn't announce a price for the DGX A100 Station, I'm guessing it's approaching six figures, and that might seem high for a workstation. But considering the compensation level of data scientists, the cost is a bargain if it keeps them working instead of sitting around waiting for models to run on CPU systems. If one factors in the opportunity costs of not having an AI/ML optimized system, the Station is a no-brainer for workgroups that need this kind of compute power.
Some companies might turn all AI infrastructure over to the IT organization, and that’s a perfectly fine model. Those companies likely will leverage one of the server form factors.
For those who leave the infrastructure decisions and purchasing within the lines of business, the DGX A100 Station is ideally suited. GPU power at the desk might seem a bit sci-fi-ish, but NVIDIA announced it this week.
Zeus Kerravala is an eWEEK regular contributor and the founder and principal analyst with ZK Research. He spent 10 years at Yankee Group and prior to that held a number of corporate IT positions.