3 Tips for Speeding Up Interrupt Handlers

Embedded software developers today are a bit spoiled. Many microcontrollers come with an ecosystem that includes peripheral drivers, an RTOS, middleware and even example application code. Many developers can spend most of their time in high-level application code, ignoring the software that meets the hardware. The problem is that while this prebuilt ecosystem can accelerate development, that acceleration is often at the cost of clock cycles and execution efficiency. In today’s post, we will explore several tips developers can apply to help improve the efficiency of their interrupt service routine callbacks that are tightly integrated in many microcontroller software frameworks.

Prerequisite #1 – Measure ISR Execution Time

The first step to speeding up software execution is to stop and take some measurements. How do you know if your interrupt handlers are using too much CPU time or running slowly? You measure it! There are several different options that developers can leverage to measure interrupt execution times.

First, simply toggle a GPIO line! I will often initialize a test GPIO line high, drive it low when I enter the ISR, and drive it high again just before the ISR exits. The result is an active-low pulse whose width approximately represents the ISR execution time. The measurement is approximate because it does not account for the time needed to toggle the GPIO line, which we assume is negligible (but may not be if you are using framework code!). This method produces a simple waveform that is easy to measure on an oscilloscope or logic analyzer.
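The GPIO toggle technique can be sketched roughly as follows. This is a minimal, host-buildable illustration, not code from any vendor framework: the register variable `test_gpio_odr`, the pin mask `TEST_PIN_MASK`, and the function names are all hypothetical stand-ins for whatever your microcontroller actually provides.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped GPIO output register so the
 * sketch compiles on a host. On real hardware this would be the port's
 * output data (or bit set/reset) register. */
static volatile uint32_t test_gpio_odr = 0;

#define TEST_PIN_MASK (1u << 5)   /* assumed test pin: bit 5 */

static volatile uint32_t isr_count = 0;

/* Direct register writes keep the cost of the measurement itself small. */
static inline void test_pin_high(void) { test_gpio_odr |= TEST_PIN_MASK; }
static inline void test_pin_low(void)  { test_gpio_odr &= ~TEST_PIN_MASK; }

/* Call once at startup so the line idles high. */
void test_pin_init(void)
{
    test_pin_high();
}

/* Example ISR: the pin stays low for roughly the duration of the handler. */
void example_isr(void)
{
    test_pin_low();    /* falling edge marks ISR entry */
    isr_count++;       /* placeholder for the real ISR work */
    test_pin_high();   /* rising edge marks ISR exit */
}
```

Writing the register directly, rather than going through a framework call, keeps the toggle overhead small relative to the time being measured.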

The second method, which I will mention only briefly, is to use trace software. If you are using an RTOS, it will often have a way to record events happening in the system, including entry to and exit from an interrupt service routine. Developers can then use a trace analyzer to see how long their interrupt service routines take to execute.

At first glance, the 24.3 us measured above may not seem too bad for an ISR. Whether it is good or bad really depends on the application, but in general, we want our ISR execution time to be as short as possible. In this example, I set up an input capture peripheral to measure an incoming signal’s frequency. If the frequency is a measly 20 kHz, this ISR will eat ~50% of the CPU cycles!
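The ~50% figure follows from simple arithmetic: a 20 kHz signal has a 50 us period, and 24.3 us of every period is spent in the ISR. A small helper (hypothetical name, assuming one interrupt per signal period) makes the calculation explicit:

```c
/* Percentage of CPU time consumed by an ISR that runs for isr_time_us
 * microseconds once per period of a signal at freq_hz.
 * Assumes exactly one interrupt per signal period. */
double isr_cpu_load_percent(double isr_time_us, double freq_hz)
{
    double period_us = 1.0e6 / freq_hz;       /* time between interrupts */
    return 100.0 * isr_time_us / period_us;   /* share of each period spent in the ISR */
}
```

For 24.3 us at 20 kHz this gives 100 * 24.3 / 50 = 48.6%.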

Tip #1 – Inline Functions Called in an ISR

First, calling a function from an ISR is a bad idea! The function call overhead will add a whole bunch of wasteful clock cycles to the interrupt which will delay getting back to the regularly scheduled code execution. The problem though is many modern frameworks do this! For example, if you look at the generated timer interrupt from STM32CubeIDE, you’ll see something like the following:

void TIM2_IRQHandler(void)
{
  /* USER CODE BEGIN TIM2_IRQn 0 */
  HAL_GPIO_WritePin(TxTest_GPIO_Port, TxTest_Pin, GPIO_PIN_RESET);

  /* USER CODE END TIM2_IRQn 0 */
  HAL_TIM_IRQHandler(&htim2);

  /* USER CODE BEGIN TIM2_IRQn 1 */
  HAL_GPIO_WritePin(TxTest_GPIO_Port, TxTest_Pin, GPIO_PIN_SET);
  /* USER CODE END TIM2_IRQn 1 */
}

Now, I added in the GPIO HAL calls, but you can see that by default, the interrupt makes a call to HAL_TIM_IRQHandler which is a generic interrupt handler for all timers on the STM32. (This is a great framework idea for reusable and portable code, but it can be detrimental to code that is time sensitive). If we examine the definition for HAL_TIM_IRQHandler, we find the following:

void HAL_TIM_IRQHandler(TIM_HandleTypeDef *htim)
{
    // Body removed for brevity
}

There is no attempt here to tell the compiler that this function is called from a time-critical ISR, so the compiler will likely emit the code for a function call and add useless cycles to the ISR. In fact, this function conditionally checks and calls several other functions, which may make things even worse. Inlining the function can potentially decrease the execution time at the expense of a slightly larger code size. The inline keyword is only a hint, and the compiler can act on it only if the function’s definition is visible at the call site or link-time optimization is enabled. With that caveat, the change is just to add the inline keyword to the function definition as shown below:

inline void HAL_TIM_IRQHandler(TIM_HandleTypeDef *htim)
{
    // Body removed for brevity
}

Taking before and after measurements, in this case I find that I can shave 0.2 us off the interrupt execution time. Not huge, but in a time sensitive application, it’s something.
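For helpers you write yourself, a common way to give the compiler the best chance of inlining is to declare them `static inline` in the same file as the ISR (or in a header it includes). The sketch below is an illustration under that assumption; the mocked status register and all names are hypothetical and are not part of the STM32 HAL.

```c
#include <stdint.h>

/* Hypothetical peripheral state, mocked as plain variables so the sketch
 * builds on a host. */
static volatile uint32_t mock_status_reg = 0;
static volatile uint32_t tick_count = 0;

#define UPDATE_FLAG (1u << 0)

/* 'static inline' with the body visible in this translation unit lets the
 * optimizer substitute the body at the call site. */
static inline void clear_update_flag(void)
{
    mock_status_reg &= ~UPDATE_FLAG;
}

void timer_isr(void)
{
    if (mock_status_reg & UPDATE_FLAG) {
        clear_update_flag();   /* candidate for inlining: no call/return overhead */
        tick_count++;
    }
}
```

Whether the call is actually inlined still depends on the compiler and optimization level, so it is worth confirming with a before and after measurement or a look at the disassembly.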

Tip #2 – Customize the Default Interrupt Service Routine (ISR)

Pre-built frameworks will often lump together interrupt handling for a peripheral type. For example, the timer interrupt handler we just looked at is passed a timer object and then works through a bunch of conditional statements to decide what it should be doing. The framework is built for reuse, NOT execution speed. If I rewrite my interrupt to remove all this generic function calling, the interrupt execution time becomes 21.712 us, a saving of about 2.5 us (10.3%)! For the numbers we are looking at, it doesn’t seem like much, but for a high-frequency interrupt, that can add up to a ton of CPU utilization.
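To give a feel for what a specialized handler looks like, here is a rough sketch that services only one timer and only one event. It is not the handler measured in this post. The timer registers are mocked with a hypothetical `mock_timer_t` struct so the code builds on a host, and the flag position is an assumption; on real hardware you would use the vendor's register definitions and follow the reference manual's rules for clearing status flags.

```c
#include <stdint.h>

/* Mock of the few timer registers this sketch touches. On a real part these
 * would come from the vendor's device header. */
typedef struct {
    volatile uint32_t SR;     /* status register */
    volatile uint32_t CCR1;   /* capture/compare register 1 */
} mock_timer_t;

static mock_timer_t mock_tim2;

#define MOCK_SR_CC1IF (1u << 1)   /* assumed position of the capture 1 flag */

static volatile uint32_t last_capture;
static volatile uint8_t  capture_ready;

/* Handler specialized for one timer and one event: a single flag test,
 * no handle pointer, no chain of conditionals or callbacks. */
void tim2_capture_isr(void)
{
    if (mock_tim2.SR & MOCK_SR_CC1IF) {
        mock_tim2.SR &= ~MOCK_SR_CC1IF;   /* acknowledge the interrupt (mock semantics) */
        last_capture = mock_tim2.CCR1;    /* grab the captured count */
        capture_ready = 1;                /* signal the application code */
    }
}
```

The trade-off is the one described above: the handler is faster because it knows exactly which event it serves, but it is no longer reusable across timers.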

Tip #3 – Optimize Interrupt Service Routine (ISR) Callback Functions

I often notice that example code for various capabilities is written simply to show developers how something can be done. For example, many vendors will provide input capture code that shows how to calculate the duty cycle and frequency of a signal. This is fantastic, except that the calculation is often executed from within an interrupt service routine. This is suboptimal. In fact, the examples I’ve been showing throughout this blog are all related to calculating frequency using input capture. 21.712 us is a long time for an interrupt to run when you are measuring a signal’s frequency.

Example code is just that, an example. The algorithms are often correct, but they are not written with production intent. They may not take into account important factors such as CPU load and real-time response. The vendor just wants to show you that their part can do what you need, whether that is measuring frequency or some other feature.

I’m not going to show the interrupt service routine code here; that may be saved for some future blog. However, I will share with you that after I went in and rewrote the interrupt handler, the execution time went from 21.712 us down to 7.472 us! That is a difference of 14.24 us, a 65.6% reduction that leaves the ISR at just 34.4% of its previous execution time! All because I intervened in the example code and followed best practices for writing interrupt handlers.
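One widely used best practice is to keep only the time-critical capture in the ISR and defer the arithmetic to the main loop. The sketch below illustrates that general pattern only. It is not the rewritten handler from this post, and the 1 MHz timer tick rate, the 16-bit counter width, and all names are assumptions made for the example.

```c
#include <stdint.h>

#define TIMER_CLOCK_HZ 1000000u   /* assumed timer tick rate: 1 MHz */

static volatile uint16_t prev_capture;
static volatile uint16_t period_ticks;
static volatile uint8_t  period_valid;

/* ISR side: one integer subtraction and a flag, nothing else.
 * 'capture' is the value read from the (assumed 16-bit) capture register. */
void capture_isr(uint16_t capture)
{
    /* Unsigned 16-bit subtraction handles a single counter wraparound. */
    period_ticks = (uint16_t)(capture - prev_capture);
    prev_capture = capture;
    period_valid = 1;
}

/* Main-loop side: the division runs outside interrupt context.
 * Returns 0 if no new measurement is available. */
uint32_t poll_frequency_hz(void)
{
    if (!period_valid) {
        return 0;
    }
    period_valid = 0;
    uint16_t ticks = period_ticks;
    return ticks ? TIMER_CLOCK_HZ / ticks : 0;
}
```

The first result after startup should be discarded, since there is no valid previous capture yet. On a real target you would also confirm that the shared variables are read atomically, or briefly mask the interrupt while copying them.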

Conclusions

It’s fantastic that developers today have so much example code and so many frameworks provided out of the box for us to leverage. It’s important to note though that this code may not be designed or implemented for our own purposes. It’s often written quickly to demonstrate a feature or capability and not designed for production.

In this post, I took a simple input capture frequency measurement example whose interrupt service routine was taking 24.4 us to execute and, after some very simple adjustments, achieved a 7.472 us execution time. That’s less than a third of the processing time of the original implementation! (And the portability and maintainability of the code, and its response time, were unaltered.)

This example should make you ask yourself, how much are you trusting your example code and how much processing power are you wasting?
