Earlier this week, Arm announced a new Cortex-M processor that is going to revolutionize how embedded system developers build IoT devices: the Cortex-M55. The Cortex-M55 builds on the Armv8.1-M architecture to give developers more powerful features that will change how IoT devices are designed and built. It should make IoT development simpler, especially for machine learning applications, which many embedded developers are still trying to wrap their minds around. Arm also announced a second processor, the Ethos-U55, a micro neural processing unit (microNPU) which, when coupled with the Cortex-M55, results in a dramatic uplift in machine learning processing capabilities. In today’s post, I’m going to examine five reasons the new Arm Cortex-M55 will transform the IoT.
Reason #1 – Machine Learning Performance Improvements through Helium
If I had to choose just a single feature to highlight for the Cortex-M55, it would be its support for Arm Helium technology. Helium is an optimized vector extension that brings Neon-like compute capabilities to Cortex-M processors. According to Arm, Helium improves digital signal processing (DSP) performance by as much as 5x and machine learning performance by up to 15x compared with previous Cortex-M generations!
There are several ways that Helium achieves this dramatic performance improvement, such as:
- Optimized SIMD instructions to process multiple data in a single instruction
- A 128-bit vector length, with support for:
  - Gather load and scatter store
  - Low overhead loops
  - Branch prediction
- Datatype support such as:
  - Half and single precision float
  - 8-bit, 16-bit, 32-bit and 64-bit vector data types
- Complex math support
- FPU register bank reuse
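To make the SIMD idea concrete, here is a minimal sketch in plain, portable C of an 8-bit dot product, a kernel that sits at the heart of many machine learning inference workloads. It does not use Helium intrinsics; the function names `lanes_for` and `dot_q7` are hypothetical and chosen only for illustration. The outer loop steps through the data 16 elements at a time, which is how many 8-bit values fit in one 128-bit vector, so each outer iteration corresponds roughly to the work a single vector-wide operation could cover. Whether a compiler actually turns this loop into vector instructions depends on the toolchain and its settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Helium vectors are 128 bits wide, so the number of lanes
 * depends on the size of each element. */
#define VECTOR_BITS 128u
#define LANES_INT8  (VECTOR_BITS / 8u)   /* 16 lanes of 8-bit data */

/* Hypothetical helper: lanes available for a given element width in bits. */
unsigned lanes_for(unsigned element_bits)
{
    return VECTOR_BITS / element_bits;
}

/* Hypothetical helper: dot product of two int8 arrays with a 32-bit
 * accumulator. This is scalar C written in a "vector-shaped" way; it
 * assumes n is small enough that the accumulator cannot overflow. */
int32_t dot_q7(const int8_t *a, const int8_t *b, size_t n)
{
    int32_t acc = 0;
    size_t i = 0;

    /* Main loop: one outer iteration per 16-element (128-bit) chunk. */
    for (; i + LANES_INT8 <= n; i += LANES_INT8) {
        for (size_t lane = 0; lane < LANES_INT8; ++lane) {
            acc += (int32_t)a[i + lane] * (int32_t)b[i + lane];
        }
    }

    /* Tail loop: leftover elements that do not fill a whole chunk. */
    for (; i < n; ++i) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }

    return acc;
}
```

For a 20-element input, the main loop handles the first 16 elements in a single chunk and the tail loop handles the remaining 4. With 16-bit data the same 128-bit vector would hold 8 lanes, and with 32-bit data it would hold 4, which is why smaller data types tend to benefit the most from vector processing.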
All these new vector extensions will make running a machine learning inference at the endpoint far faster and more energy efficient. I found the average DSP kernel performance per datatype for the Cortex-M55 versus other Cortex-M processors particularly interesting. Take a look at the comparison below to see how well the Cortex-M55 performs:
(Image Source: Arm)
Reason #2 – Expand Local Compute Use Cases
As machine learning moves from the cloud to the endpoint, the number of use cases is expanding rapidly. The most common use case for machine learning on a microcontroller at the endpoint right now is keyword spotting. With the Cortex-M55 and its built-in Helium technology, developers will be able to dramatically expand the number of use cases that on-device machine learning can cover. For example, there is already an emerging need for sensing and control solutions in ultra-efficient small device applications such as:
- Vibration and motion
- Voice and sound
- Vision and image
These capabilities will allow machine learning to be used for robotics, predictive maintenance, voice control and object detection. They will span multiple industries and even reach into space, to the moon and beyond. In fact, you can get a feel for the use case coverage from the following diagram:
Reason #3 – Simplified Development Model
One of the problems with today’s development toolchains for intelligent endpoint devices is that developers must work out of three separate toolchains:
- The Cortex-M toolchain
- The Digital Signal Processing toolchain
- The Neural Processing Unit toolchain
Working out of three different toolchains creates unnecessary complexity and can drive up both development time and costs.
With the Cortex-M55, the toolchains are completely integrated! Bringing all the development under a single toolchain lowers system complexity and simplifies the programmer’s model. One toolchain also means that costs can be better controlled and that there will be fewer integration issues.
Reason #4 – Built-in Security with TrustZone
Security is an important part of every IoT application. The Cortex-M55 supports TrustZone, which creates a hardware-based isolation layer between a Secure execution environment and a Non-secure execution environment. This allows developers to follow Platform Security Architecture (PSA) best practices and ensure that all the pieces are in place to secure their IoT application at the endpoint.
Reason #5 – Integration with a microNPU, the Ethos-U55
One of the most interesting reasons that the Cortex-M55 will help transform the way IoT devices are designed and built is that it can be paired with Arm’s Ethos-U55 microNPU (micro neural processing unit). Combined with the Cortex-M55, this new processor can improve machine learning performance by as much as 480x!
The Cortex-M55 holds a lot of promise for transforming the IoT and bringing applications from the cloud to the endpoint. A major part of what makes this possible is the Helium vector extension, which gives DSP and machine learning applications a significant performance improvement. That performance increase will open up new potential use cases for endpoint devices, and a simplified, unified programmer’s model will make them easier to achieve.