Firmware Error Handling using Do while Loops

An interesting area of a code base to examine is error handling. I’ve found that many firmware and embedded software projects don’t do an excellent job of managing errors and faults. For example, there is a lot of optimism that if I want to transmit a character over a serial interface, the character will be transmitted 100% of the time, no matter the circumstances. While optimism in a lab setting is expected, for production code, developers should ask themselves what can go wrong and how they should handle errors and faults. This post will explore the types of firmware error handling I often see in open source and production code and how we can create better error handling using do while loops.

Typical Firmware Error Handling

The type of error handling I often encounter can’t be considered error handling at all. In fact, error handling is nearly nonexistent, and if something goes wrong, the system will end up in bad shape. For example, if you were to examine the start-up code in your microcontroller vendor’s support package, you would likely find some code that starts up the microcontroller’s internal or external oscillator. The code usually looks something like the following:

static void GCLK2_Initialize(void)
{
    GCLK_REGS->GCLK_GENCTRL[2] = GCLK_GENCTRL_DIV(48U) | GCLK_GENCTRL_SRC(6U) | GCLK_GENCTRL_GENEN_Msk;

    while((GCLK_REGS->GCLK_SYNCBUSY & GCLK_SYNCBUSY_GENCTRL_GCLK2) == GCLK_SYNCBUSY_GENCTRL_GCLK2)
    {
        /* wait for the Generator 2 synchronization */
    }
}

Do you see any problems with the error handling of this code? I can spot several right off the bat. The biggest issue I see is that if the initialization fails, we end up in an infinite loop that prevents the system from booting!
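One way to avoid the endless wait is to bound the number of polls and report the outcome to the caller. The following is a minimal sketch, not the vendor's code: SyncBusyRegister is a plain variable standing in for a memory-mapped register such as GCLK_SYNCBUSY, and Sync_WaitBounded and SYNC_TIMEOUT_COUNT_MAX are hypothetical names with an arbitrary limit chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical poll budget; a real value depends on the clock configuration. */
#define SYNC_TIMEOUT_COUNT_MAX (10000U)

/* Stand-in for a memory-mapped status register such as GCLK_SYNCBUSY. */
static volatile uint32_t SyncBusyRegister = 0U;

/* Polls until the bits selected by Mask clear or the poll budget is used up.
 * Returns true if synchronization completed, false if the wait timed out. */
static bool Sync_WaitBounded(volatile const uint32_t *Register, uint32_t Mask)
{
    uint32_t PollCount = 0U;

    while (((*Register & Mask) != 0U) && (PollCount < SYNC_TIMEOUT_COUNT_MAX))
    {
        PollCount++;
    }

    /* Re-read the register so the result reflects the final hardware state. */
    return ((*Register & Mask) == 0U);
}
```

With a helper like this, the initialization function could return the result to its caller, which can then decide whether to fall back to another clock source or set a fault instead of hanging during boot.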

Let’s look at another example. What if I want to transmit characters over a serial interface like SPI? It’s pretty common to come across code that looks like the following:

TransferAccepted = false;

// Send and receive the data
while(TransferAccepted != true)
{
    TransferAccepted = SERCOM4_SPI_WriteRead(&TxBuffer[Device][0], TxSize, &RxBuffer[Device][0], RxSize);
}

The code above checks the return value from the transmit function, which is good, but if the transfer fails, the code will get stuck in an infinite loop. So I’m not sure this is much better than other code I see, which just assumes that the transfer was successful, as shown below:

SERCOM4_SPI_WriteRead(&TxBuffer[Device][0], TxSize, &RxBuffer[Device][0], RxSize);

Do while loop in C

One of the most underutilized loop statements in firmware and embedded software systems is the do while loop. The structure of a do while statement is quite simple:

do
{
    // some stuff
}while(conditional == true);

The significant difference between a do while loop and a while loop is that the statements within the do while block will execute at least once, because the condition is checked after the block runs rather than before. The fact that the code runs at least once can help simplify error handling code. For example, if I want to use a while loop, I usually have to set the state of the loop variable before the loop so that the body is entered the first time, as shown below:

bool TransferAccepted = false;

while(TransferAccepted == false)
{
    TransferAccepted = SERCOM4_SPI_WriteRead(&TxBuffer[Device][0], TxSize, &RxBuffer[Device][0], RxSize);
}

The same code can be rewritten and simplified using a do while loop as shown below:

bool TransferAccepted;

do
{
    // Declared outside the block so the loop condition can see it
    TransferAccepted = SERCOM4_SPI_WriteRead(&TxBuffer[Device][0], TxSize, &RxBuffer[Device][0], RxSize);
}while(TransferAccepted == false);

The above code is a good step towards better firmware error handling, but we aren’t quite there yet. We are still plagued by the potential for infinite loops to occur.

A general pattern for error handling with do while loops

As embedded software developers, we must recognize that while the likelihood that a peripheral locks up or stops responding may be small, when we ship thousands of products for several years, the chances that it will happen in the field become non-zero. Depending on the type of products you design, failing to handle the fault correctly might be insignificant, or it might result in significant lawsuits against your company.

There is a simple code pattern that I use whenever I am going to interface with hardware. The code pattern does several things:

  • Runs the desired code to interact with the hardware at least once
  • Retries the interaction if the previous attempt was unsuccessful
  • Stops after a predetermined number of attempts, so the code does not hang but instead sets an error and moves on

As you might suspect, a do while loop with some additional logic can help us to achieve the above requirements. The template for the implementation looks like the following code:

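bool TransferAccepted = false;
uint32_t RetryCount = 0U;

do
{
    // Add some stuff to do and store the result in TransferAccepted

    RetryCount++;

    if((TransferAccepted == false) && (RetryCount == RETRY_COUNT_MAX))
    {
        // Set fault or warning to notify application
    }
}while((TransferAccepted == false) && (RetryCount < RETRY_COUNT_MAX));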

The code is simple and compact and can easily be reused for any hardware-dependent or even hardware-independent application calls. If I wanted to write some code to interact with the SPI bus, I could take the template and with a few modifications end up with something like the following:

bool TransferAccepted;
uint32_t RetryCount = 0U;

do
{
    TransferAccepted = SERCOM4_SPI_WriteRead(&TxBuffer[Device][0], TxSize, &RxBuffer[Device][0], RxSize);
    RetryCount++;

    if((RetryCount == RETRY_COUNT_MAX) && (TransferAccepted == false))
    {
        Fault_Set(SPI_TRANSFER_FAILED);
    }
}while((TransferAccepted == false) && (RetryCount < RETRY_COUNT_MAX));
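Because the pattern is the same for every peripheral, it could also be wrapped in a small helper that takes the operation as a function pointer. The following is a sketch under stated assumptions rather than a definitive implementation: Operation_t, Retry_Run, and FakeTransfer are hypothetical names introduced here for illustration, and the value of RETRY_COUNT_MAX is arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary limit for illustration; choose a value that suits the product. */
#define RETRY_COUNT_MAX (3U)

/* Hypothetical signature: any operation that returns true on success. */
typedef bool (*Operation_t)(void *Context);

/* Runs Operation at least once and retries until it succeeds or
 * RETRY_COUNT_MAX attempts have been made. Returns the final result so the
 * caller can decide which fault to set. AttemptsOut is optional. */
static bool Retry_Run(Operation_t Operation, void *Context, uint32_t *AttemptsOut)
{
    bool Accepted;
    uint32_t RetryCount = 0U;

    do
    {
        Accepted = Operation(Context);
        RetryCount++;
    } while ((Accepted == false) && (RetryCount < RETRY_COUNT_MAX));

    if (AttemptsOut != NULL)
    {
        *AttemptsOut = RetryCount;
    }

    return Accepted;
}

/* Test double standing in for a hardware call: fails until the countdown
 * pointed to by Context reaches zero, then succeeds. */
static bool FakeTransfer(void *Context)
{
    uint32_t *FailuresLeft = (uint32_t *)Context;

    if (*FailuresLeft > 0U)
    {
        (*FailuresLeft)--;
        return false;
    }

    return true;
}
```

A caller would check the return value and set the appropriate fault, for example calling Fault_Set(SPI_TRANSFER_FAILED) when Retry_Run returns false. Keeping the fault decision in the caller lets the same helper serve peripherals with different fault codes.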

Firmware Error Handling Conclusions

The state of error handling in the embedded software industry today is quite dire. We often assume that everything will work the way it’s supposed to. Unfortunately, that is not the case. We need to assume that interactions with hardware might fail. If they do, we may want to retry the exchange before setting an error. In this post, we looked at how developers can use the do while loop to build simple error handling logic that retries an interaction a bounded number of times before setting an error.
