MIT researchers believe they have found a way to bring deep-learning capabilities to smartphones.
At the International Solid State Circuits Conference (ISSCC) last week, the researchers from the Massachusetts Institute of Technology introduced a 168-core chip that they said will enable smartphones and other mobile and embedded devices to run artificial intelligence (AI) algorithms locally, letting much of the work of collecting and processing data be done on the device itself.
Currently, data collected by these devices, systems and sensors is uploaded to the Internet for processing by powerful servers, after which the processed data is sent back to the device. This opens up an array of issues, from latency and network congestion to security and power consumption. In addition, it means that to process the data, the device needs to be connected to a network.
The new chip—which the researchers dubbed “Eyeriss”—is 10 times more efficient than a mobile GPU that typically has 200 cores, and it brings features that enable it to run complex neural networks on mobile devices rather than only on servers in data centers. And because the processing work is done locally, there doesn’t have to be an Internet connection or access to servers.
The possibilities enabled by running neural networks locally—making decisions regarding the raw data on the devices and sending only their conclusions to the Internet—are broad, according to Vivienne Sze, the Emanuel E. Landsman Career Development Assistant Professor in MIT’s Department of Electrical Engineering and Computer Science, whose group developed the new chip.
“Deep learning is useful for many applications, such as object recognition, speech, face detection,” Sze said in a report in MIT News. “Right now, the networks are pretty complex and are mostly run on high-power GPUs. You can imagine that if you can bring that functionality to your cell phone or embedded devices, you could still operate even if you don’t have a WiFi connection.”
Being able to process the data locally also offers greater security and privacy, and reduces transmission latencies, enabling users to react more quickly to the data that’s been collected and analyzed, she said. It could have a significant impact on the development of the Internet of things (IoT), enabling the billions of networked devices armed with artificial intelligence algorithms to crunch the data themselves rather than having to send the information to servers through the Internet. Instead, the conclusions derived from running the data through the algorithms—done locally on the device—would be what are sent via the Internet.
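In practice, that pattern amounts to running the model on the device and transmitting only its small output. The short Python sketch below illustrates the idea; the names used here (SensorReading, run_local_model, send_to_cloud) are hypothetical stand-ins for illustration only, not an Eyeriss or MIT API.

from dataclasses import dataclass
from typing import List

@dataclass
class SensorReading:
    values: List[float]  # raw data captured on the device

def run_local_model(reading: SensorReading) -> str:
    # Stand-in for a neural network running entirely on the device.
    # Here we simply threshold the mean of the readings.
    mean = sum(reading.values) / len(reading.values)
    return "anomaly" if mean > 0.8 else "normal"

def send_to_cloud(conclusion: str) -> None:
    # Only the small conclusion leaves the device, never the raw data.
    print("uploading conclusion:", conclusion)

if __name__ == "__main__":
    reading = SensorReading(values=[0.91, 0.87, 0.95])
    send_to_cloud(run_local_model(reading))

The raw sensor values never leave the device; only the one-word conclusion does, which is what reduces both the data sent over the network and the exposure of sensitive information.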
A growing array of tech vendors is putting significant money and effort into research on artificial intelligence and deep learning, including IBM, Google, Microsoft, Qualcomm and Apple. GPU vendor Nvidia is making deep learning and neural networks a key part of the company’s product roadmap, with CEO Jen-Hsun Huang saying last year that the “topic of deep learning is probably as exciting an issue as any in this industry.” Among the researchers working on Eyeriss is Joel Emer, a professor of the practice in MIT’s Department of Electrical Engineering and Computer Science and a senior distinguished research scientist at Nvidia.
Neural networks are built from a series of layers, each containing a large number of processing nodes; the nodes in one layer process the incoming data, then pass it on to the nodes in the next layer, with the goal that the correct answer to a computational question emerges after the data has passed through multiple layers. In a convolutional neural network, the nodes in each layer process the data in different ways, which means the networks can be fairly large, according to the MIT researchers.
These neural networks also are designed to learn from their experience, much like a human brain.
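For a concrete picture of that layered structure, the minimal NumPy sketch below passes an input through three made-up layers of weights; the layer sizes and values are arbitrary and serve only to show data flowing from one layer’s nodes to the next, not to reproduce any network the researchers used.

import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Simple nonlinearity applied at each layer's nodes.
    return np.maximum(x, 0.0)

# Each weight matrix plays the role of one layer of nodes: it transforms
# the incoming data and hands the result to the next layer.
layers = [
    rng.standard_normal((16, 8)),  # 16 input features -> 8 nodes
    rng.standard_normal((8, 4)),   # 8 nodes -> 4 nodes
    rng.standard_normal((4, 2)),   # 4 nodes -> 2 outputs ("conclusions")
]

x = rng.standard_normal(16)        # raw input, e.g. image or audio features
for w in layers:
    x = relu(x @ w)                # process, then pass on to the next layer

print("network output:", x)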
With Eyeriss, the goal was to develop a chip that enabled as much work as possible to be done on the device. What researchers wanted to do was reduce how frequently cores needed to exchange data with memory banks, which would cut the time and energy used by the device. In a GPU, the cores share a single memory bank, while with Eyeriss, each core has its own memory. In addition, the chip includes a circuit that compresses the data before sending it to the individual cores.
Other features include the ability for each core to communicate directly with its immediate neighbors, as well as circuitry dedicated to allocating tasks across the individual cores.
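The payoff of giving each core its own memory can be seen with a toy accounting exercise: if a core keeps the data it reuses in a local buffer, it touches the shared memory bank far less often. The Python sketch below makes that comparison with invented numbers; the reuse count is hypothetical, and this is not a model of the actual Eyeriss dataflow.

NUM_CORES = 168              # the core count reported for Eyeriss
REUSES_PER_CORE = 100        # hypothetical: times each core reuses its data

def shared_bank_fetches() -> int:
    # If cores share one memory bank and refetch on every use,
    # every reuse costs a trip to that bank.
    return NUM_CORES * REUSES_PER_CORE

def local_buffer_fetches() -> int:
    # If each core fetches its data once into its own memory and
    # reuses it locally, the shared bank is touched once per core.
    return NUM_CORES

if __name__ == "__main__":
    print("fetches with a shared bank only:", shared_bank_fetches())
    print("fetches with per-core buffers:  ", local_buffer_fetches())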
The research was funded in part by the Defense Advanced Research Projects Agency (DARPA).