New Arm Chip Handles Machine Learning Workloads on Mobile Devices

Arm’s new Project Trillium platform includes special processors for mobile devices to efficiently run machine learning, artificial intelligence and object-detection workloads.

Arm has made its living at the edge of the network, with its processor architecture powering most smartphones and many of the other devices that business users and consumers rely on in their work and home lives every day.

Now the company wants to bring improved machine learning capabilities to these devices through a new platform that includes highly scalable processors and other components designed to deliver the compute power for artificial intelligence operations within the low-power ranges these devices demand.

Company officials this week introduced Project Trillium, which includes a processor specifically made to run machine learning (ML) and neural network workloads, another processor for object detection, and software to leverage neural network frameworks such as Google’s TensorFlow, Caffe and Android NN.

The platform will enable mobile device users to run more than 4.6 trillion operations per second (TOPS) within a power budget of one to two watts, and officials said the performance could be even higher in real-world use cases.
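To put those numbers in perspective, the quoted throughput and power budget imply a range of efficiency figures. The short Python sketch below works through the arithmetic. It treats 4.6 TOPS as a flat figure across the whole one-to-two-watt budget, which is a simplifying assumption rather than anything Arm stated, and the variable and function names are illustrative only.

```python
# Figures quoted in the article; treating throughput as constant across
# the power budget is a simplifying assumption for illustration.
PEAK_TOPS = 4.6                # trillion operations per second
POWER_BUDGET_W = (1.0, 2.0)    # low and high ends of the quoted power budget


def tops_per_watt(tops: float, watts: float) -> float:
    """Return throughput per watt for a given throughput and power draw."""
    return tops / watts


best_case = tops_per_watt(PEAK_TOPS, POWER_BUDGET_W[0])    # drawing 1 W
worst_case = tops_per_watt(PEAK_TOPS, POWER_BUDGET_W[1])   # drawing 2 W

print(f"Implied efficiency: {worst_case:.1f} to {best_case:.1f} TOPS per watt")
# prints "Implied efficiency: 2.3 to 4.6 TOPS per watt"
```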

“The growth of machine learning represents the biggest inflection point in computing for more than a generation,” Jem Davies, fellow, vice president and general manager of Arm’s Machine Learning Group, wrote in a post on the company blog. “It will have a massive effect on just about every segment. … Project Trillium represents a suite of Arm products that gives device-makers all the hardware and software choices they need.”

The rise of the internet of things (IoT) and the cloud is rapidly changing where and how computing is being done. The tens of billions of connected devices, systems and sensors that make up the IoT are generating massive amounts of data that businesses, governments and researchers need to collect, store, process and analyze in as near to real time as possible.

At the same time, attempting to bring all that data back to central data centers for processing is too expensive and takes too much time. So more of the compute, storage and analytics are being done at the network edge, closer to where the data is being generated.

With its low-power architecture and dominant presence in mobile devices, Arm is looking to become a key player at the network edge and in the IoT, a focus that has become even more pronounced since it was bought by SoftBank for $32 billion in 2016. Much of its TechCon show last year revolved around the IoT and included the unveiling of a platform for developing secure connected devices.

With Project Trillium, the goal is to give these devices the compute power and energy efficiency to run machine learning operations, even if they’re not connected to the cloud, Davies wrote. Initially the technologies in the platform will be optimized for mobile devices and smart IP cameras, but they will be able to scale up and down to handle such devices as sensors, smart speakers and home entertainment, officials said.

“We already see devices running ML tasks on Arm-powered devices in products such as smart speakers featuring keyword spotting,” Davies wrote. “This will continue and expand rapidly. At the high end, there is ML inference (analyzing data using a trained model) being performed in connected cars and servers, and we have an ability to scale our technologies to suit those applications too. We now have an ML processor architecture that is versatile enough to scale to any device, so it is more about giving markets what they need, when they need it.”

The machine learning processor will be able to run more than 3 trillion operations per second per watt, giving devices both the power and efficiency to run such workloads. The object detection processor will include real-time detection with Full HD processing at 60 frames per second and offer up to 80 times the performance of a traditional digital signal processor (DSP). The combination of the two will give users high-end people and face detection and recognition capabilities, officials said.
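For a sense of what real-time Full HD processing at 60 frames per second means in raw terms, the Python sketch below computes the pixel throughput and the time available per frame. It assumes Full HD means the standard 1920 x 1080 resolution, which the announcement does not spell out, and it says nothing about how Arm's object detection processor actually handles each frame.

```python
# Assumption: "Full HD" is the standard 1920 x 1080 resolution.
WIDTH, HEIGHT = 1920, 1080
FPS = 60  # frame rate quoted in the article

pixels_per_frame = WIDTH * HEIGHT            # 2,073,600 pixels
pixels_per_second = pixels_per_frame * FPS   # 124,416,000 pixels
frame_budget_ms = 1000 / FPS                 # time available to process one frame

print(f"Pixels per frame:  {pixels_per_frame:,}")
print(f"Pixels per second: {pixels_per_second:,}")
print(f"Time per frame:    {frame_budget_ms:.2f} ms")
```

In other words, a detector that keeps pace with the video has a little under 17 milliseconds to deal with each frame of roughly 2 million pixels.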

The new Project Trillium platform will be available for early preview in April and generally available in the middle of the year.

Arm’s introduction of the platform comes the same week that Google officials announced they are making their own homegrown processors for machine learning available in beta on the Google Cloud Platform.

Google’s Tensor Processing Units (TPUs) are designed to deliver high-performance machine learning capabilities for TensorFlow-based workloads. With the TPUs, users will be able to train and run their artificial intelligence (AI) workloads programmed with the TensorFlow software library faster than on a system with GPU accelerators and at a lower cost, according to officials.