Engineers at Fujitsu Laboratories are focusing on the memory within graphics cards to speed up the task of machine learning on neural networks.
Fujitsu Labs this week announced new technology that officials said streamlines the use of internal GPU memory to meet the growing demand for larger-scale neural networks, which are among the foundations of the drive toward artificial intelligence (AI). Tests have shown that the technology essentially doubles the scale of neural networks that can be handled while reducing the amount of internal GPU memory used by more than 40 percent.
GPUs are finding their way into a growing number of areas, such as high-performance computing (HPC), where they can be used as accelerators for CPU-based systems. The key is their ability to run workloads in parallel, which can significantly improve the performance of systems running particular applications while keeping a lid on power consumption.
GPUs also are playing key roles in the growth of machine learning and AI, including in the deep learning area of training neural networks. With AI, systems can collect input (such as voice commands or images from the environments around them), process the data almost instantly and then react accordingly, acting much like a human brain in learning from and reacting to their surroundings and experiences.
Machine learning essentially has two parts: training, in which neural networks are taught such things as object identification, and inference, in which they use that training to recognize and process unknown inputs, such as Apple’s Siri understanding a user’s question and then responding correctly. Most training of neural networks is done on GPUs, while inference work tends to be done on Intel CPUs.
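To make the distinction concrete, here is a minimal, NumPy-only sketch (not Fujitsu's code, and far simpler than a real deep network) of the two phases on a toy model: a training loop that adjusts weights against labeled examples, followed by inference that applies the frozen weights to new input.

    # Minimal sketch of the two phases of machine learning on a toy,
    # NumPy-only "network": training learns the weights, inference
    # applies the frozen weights to unseen input.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: 2-D points labeled by which side of a line they fall on.
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)

    w = np.zeros(2)
    b = 0.0

    def forward(X, w, b):
        """Single-layer 'network': logistic regression."""
        return 1.0 / (1.0 + np.exp(-(X @ w + b)))

    # Training: repeatedly compare predictions with known labels and
    # adjust the weights (the GPU-heavy part in deep learning).
    for _ in range(500):
        p = forward(X, w, b)
        grad_w = X.T @ (p - y) / len(y)
        grad_b = np.mean(p - y)
        w -= 0.5 * grad_w
        b -= 0.5 * grad_b

    # Inference: the trained weights are fixed and applied to new input
    # (the step that often runs on CPUs in production).
    X_new = rng.normal(size=(5, 2))
    print(forward(X_new, w, b) > 0.5)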
However, both Nvidia and Intel are looking to expand their capabilities, with Nvidia eyeing inference workloads for its GPUs and Intel looking to push into training with its x86-based chips.
Fujitsu Labs officials said that over the past several years, GPUs have increasingly been relied on to handle the huge number of calculations needed for deep learning. To run these workloads at high speed, the data used in the calculations needs to be stored in the GPU’s internal memory, which hinders scale because the size of the neural network is limited by that memory capacity, they said.
The new technology from Fujitsu is designed to improve memory efficiency by reusing the GPU’s memory resources rather than simply growing the amount of internal memory. According to Fujitsu Labs officials, when machine learning begins, the structure of every layer of the neural network is analyzed, and the order of the calculations is changed so that memory space that has already been allocated for larger pieces of data can be reused.
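Fujitsu has not published its code, but the general idea of recycling memory can be illustrated with a framework-free sketch. The layer sizes below are made-up numbers, and the forward-only reuse shown here is far cruder than Fujitsu's approach, which also has to account for the intermediate data that training keeps around for backpropagation; it simply shows how reusing buffers lowers peak memory compared with giving every layer output its own allocation.

    # Hedged sketch (not Fujitsu's implementation): a generic illustration
    # of buffer reuse during a forward pass. Once an intermediate result is
    # no longer needed, its memory is recycled for later layers instead of
    # allocating fresh space for every layer output.

    # Hypothetical per-layer output sizes, in floats.
    layer_output_sizes = [4_000_000, 2_000_000, 2_000_000, 1_000_000, 1_000_000]

    def naive_peak(sizes):
        """Every layer output gets its own allocation and is kept forever."""
        return sum(sizes)

    def reuse_peak(sizes):
        """Keep only the current input and output alive; reuse the rest.

        Mimics the general idea of scheduling calculations so that memory
        allocated for earlier (often larger) data can be reused.
        """
        peak = 0
        live = sizes[0]                   # input to the first layer
        for out in sizes[1:]:
            peak = max(peak, live + out)  # input and output coexist briefly
            live = out                    # previous buffer is now reusable
        return peak

    print("naive peak (floats):", naive_peak(layer_output_sizes))
    print("reuse peak (floats):", reuse_peak(layer_output_sizes))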
By analyzing the neural network in this way, the technology ensures that calculations and data are handled so as to use the GPU’s memory space more efficiently. The result is that organizations can expand the scale of the neural networks that can run learning workloads at high speed on a single GPU, and the models resulting from that learning are more accurate.
The more layers a neural network has, the more accurately it performs such jobs as object identification and categorization. However, as neural networks have grown to improve accuracy, that growth has lengthened learning times. Some organizations have turned to running multiple GPUs in parallel, in a way similar to what’s done in supercomputers, but the data that then has to be exchanged between the GPUs slows down the learning speed.
Fujitsu Labs researchers implemented the technology in the Caffe open-source deep learning framework and measured its use of internal GPU memory. Evaluating the technology with the AlexNet and VGGNet networks showed that it lets a single GPU scale up the learning on larger neural networks while driving down the use of internal GPU memory.
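As a rough illustration of that kind of measurement (not the harness Fujitsu used), the sketch below runs a few training iterations through Caffe's Python interface and polls nvidia-smi for GPU memory usage. It assumes pycaffe and nvidia-smi are installed, and the solver file name is hypothetical.

    # Hedged sketch: measure GPU memory use while training a network via
    # pycaffe. Assumes "alexnet_solver.prototxt" (hypothetical path)
    # describes an AlexNet solver.
    import subprocess
    import caffe

    def gpu_memory_used_mib(device=0):
        """Query current GPU memory usage (MiB) via nvidia-smi."""
        out = subprocess.check_output([
            "nvidia-smi", "--query-gpu=memory.used",
            "--format=csv,noheader,nounits", "-i", str(device)
        ])
        return int(out.decode().strip())

    caffe.set_device(0)
    caffe.set_mode_gpu()

    solver = caffe.get_solver("alexnet_solver.prototxt")  # hypothetical file
    print("before training:", gpu_memory_used_mib(), "MiB")

    solver.step(20)  # run a few training iterations
    print("after 20 iterations:", gpu_memory_used_mib(), "MiB")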
Fujitsu Labs officials outlined details of the technology earlier this month at the IEEE Machine Learning for Signal Processing 2016 event, and they plan to commercialize it as part of Fujitsu’s AI technology.