How NVIDIA EGX Accelerates AI at the Edge

TREND ANALYSIS: The model of the engineered system is that it’s a full turnkey solution that offers all the necessary hardware and software required to perform a specific task. NVIDIA’s EGX is literally about plug-and-play AI.


This week at Computex 2019 in Taipei, GPU market leader NVIDIA announced its new EGX server, an engineered system that brings high-performance, low-latency AI to the edge. The concept of EGX is similar to NVIDIA’s DGX, an engineered system designed specifically for data science teams (hence the D in DGX; the E in EGX stands for edge).

The model of the engineered system is that it’s a full turnkey solution that offers all the necessary hardware and software required to perform that specific task. It’s literally plug-and-play AI.

Engineered systems accelerate deployment time

I’ve spoken to DGX customers who told me that the turnkey nature of DGX cuts the process of deploying, tweaking and tuning the infrastructure required for data science from several weeks or even months to a single day. I expect EGX to have a similar, if not greater, value proposition.

DGX is typically deployed in places where data scientists and IT pros have access to it. EGX is designed for edge locations like 5G base stations, warehouses, factory floors, oil fields and other places where nobody has physical access to the server. Having the right infrastructure in place on day one is critical to success.

EGX software is optimized for AI edge inferencing

The NVIDIA EGX Edge Stack is optimized for the rigors of AI inferencing. Models can be trained anywhere, but they run on EGX to interpret data. The software stack includes NVIDIA drivers, a CUDA Kubernetes plug-in, a CUDA Docker container runtime environment, CUDA-X libraries and containerized AI-specific frameworks such as NVIDIA TensorRT, TensorRT Inference Server and DeepStream. EGX is a reference architecture in which the software stack is loaded onto servers from one of many certified partners.

At launch, the following server manufacturers (in fact, all the major makers) have announced support for EGX: Acer, ASRock Rack, ASUS, Atos, Cisco Systems, Dell EMC, Fujitsu, Gigabyte, HPE, Inspur, Lenovo, QCT, Sugon, Supermicro, Tyan and Wiwynn.

Edge microserver ecosystem partners include Abaco, Adlink, Advantech, AverMedia, Cloudian, ConnectTech, Curtiss-Wright, Leetop, Musashi and WiBase.

There are also dozens of ISVs that have leveraged EGX for specialized use cases. These include AnyVision, DeepVision, IronYun and Malong. There are also a number of healthcare-specific offerings from 12 Sigma, Infervision, Quantib and Subtle Medical.

NVIDIA EGX is designed to be scalable: customers can start with an NVIDIA Jetson Nano GPU, which performs about half a trillion operations per second (0.5 TOPS), and scale up to a full rack of NVIDIA T4 GPU servers, which delivers more than 10,000 TOPS. The lower end is ideal for tasks such as image recognition, while the high end would be for real-time AI tasks like real-time speech translation and recognition.
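To put the two quoted figures side by side, here is a minimal Python sketch of the arithmetic. The throughput numbers come from the paragraph above; the per-stream cost and the function names are hypothetical, chosen only for illustration, and real capacity depends on the model, precision and batch size.

```python
# Throughput figures quoted in the article, in TOPS
# (trillions of operations per second).
JETSON_NANO_TOPS = 0.5    # "about half a trillion operations per second"
T4_RACK_TOPS = 10_000     # "more than 10,000 TOPS" for a full rack


def scale_factor(low_tops: float, high_tops: float) -> float:
    """Return how many times more raw throughput the high end offers."""
    return high_tops / low_tops


def streams_supported(platform_tops: float, tops_per_stream: float) -> int:
    """Estimate concurrent inference streams for a given per-stream cost.

    ASSUMPTION: tops_per_stream is a made-up planning number, not a
    measured or vendor-published value.
    """
    return int(platform_tops // tops_per_stream)


if __name__ == "__main__":
    factor = scale_factor(JETSON_NANO_TOPS, T4_RACK_TOPS)
    print(f"Full T4 rack vs. Jetson Nano: {factor:,.0f}x")

    # ASSUMPTION: 0.25 TOPS per stream is illustrative only.
    for name, tops in [("Jetson Nano", JETSON_NANO_TOPS),
                       ("T4 rack", T4_RACK_TOPS)]:
        print(f"{name}: ~{streams_supported(tops, 0.25):,} streams")
```

Raw TOPS is only a rough proxy for real workloads, but the ratio shows why the same software stack is pitched at both a single camera and a data-center-class edge site.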

CPUs can’t meet the demands of AI

In all cases, EGX brings the benefits of GPU computing to AI. At a recent event, I caught up with some folks from Intel, and we discussed edge AI and the challenges there. While we all agreed the edge is where the action is, Intel’s plans seem to revolve around its current Intel Xeon Scalable CPUs and Vector Neural Network Instructions (VNNI) extensions. I have a lot of respect for Intel, because the company pioneered the concept of computing everywhere; however, its belief that any kind of AI, including edge AI, can be done with CPUs instead of GPUs shows how clueless it is in that area.

I’m not diminishing the value of CPUs or Intel—every computer needs them—but it’s been well documented by many companies that Moore’s Law is fast approaching its limit, and the demands of AI go far beyond what CPUs are capable of achieving.  

The edge is where the action will be

EGX meets the growing demands of edge computing. For years, conventional wisdom was that all data would move to the cloud. However, not all data is best analyzed in the cloud. In smart cities, retail, oil and gas and other use cases, it makes sense to analyze the data where it’s created, and that’s most often the edge. In fact, with applications like real-time facial recognition at an airport, sending the data to the cloud, doing the analysis and sending the result back would take too long. This has not only validated the edge but also raised its value. During the next several years, the use cases for AI inferencing will explode, and the EGX server ensures that businesses deploy optimized hardware and software.
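The latency argument above can be framed as a simple budget check. The Python sketch below is only illustrative: every millisecond figure, the 100 ms budget and the names `InferencePath` and `meets_budget` are assumptions made up for this example, not measurements from NVIDIA or any deployment.

```python
from dataclasses import dataclass


@dataclass
class InferencePath:
    """One way of getting an answer back to the device that captured the data."""
    name: str
    uplink_ms: float      # time to ship the data to where the model runs
    inference_ms: float   # time to run the model
    downlink_ms: float    # time to return the result

    def total_ms(self) -> float:
        """Round-trip time: send, analyze, send back."""
        return self.uplink_ms + self.inference_ms + self.downlink_ms


def meets_budget(path: InferencePath, budget_ms: float) -> bool:
    """True if the whole round trip fits inside the application's deadline."""
    return path.total_ms() <= budget_ms


# ASSUMPTION: all numbers below are hypothetical and for illustration only.
cloud = InferencePath("cloud", uplink_ms=60.0, inference_ms=20.0, downlink_ms=60.0)
edge = InferencePath("edge", uplink_ms=2.0, inference_ms=30.0, downlink_ms=2.0)
BUDGET_MS = 100.0  # hypothetical deadline for a real-time recognition task

if __name__ == "__main__":
    for path in (cloud, edge):
        verdict = "OK" if meets_budget(path, BUDGET_MS) else "too slow"
        print(f"{path.name}: {path.total_ms():.0f} ms -> {verdict}")
```

Under these assumed numbers, the cloud path misses the deadline even though its inference step is faster, because the network round trip dominates. That is the core of the case for running inference where the data is created.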

There are many makers of GPUs today, but what keeps NVIDIA out in front is its ability to deliver a “full stack” to simplify the process of deployment. There’s no “easy button” for AI, but EGX, like NVIDIA’s other engineered systems, puts the right technology where it’s needed so organizations can worry about deriving insights from data instead of being concerned with how to cobble together AI puzzle pieces.

Zeus Kerravala is the founder and principal analyst with ZK Research. He spent 10 years at Yankee Group and prior to that held a number of corporate IT positions.