5 RTOS Design Best Practices

RTOS design has become critical to many embedded applications. RTOSes are used in more than 50% of embedded applications, and with so many devices becoming connected and starting to use machine learning, that number will only go up. An RTOS-based design comes with many caveats, and it is easy to overlook best practices. In today’s post, let’s explore 5 RTOS Design Best Practices that I often see teams overlook completely.

RTOS Design Best Practice #1 – Data Dictates Design

A mantra or design philosophy that I have adopted over the years is that good software design is driven by the data. Put another way, data dictates design. If you take a moment to consider this, it makes a lot of sense. Most systems are real-time systems where events generate data. That data in turn must flow through the application in various ways, be processed, and then result in either storage or output.

When starting an RTOS design, or any embedded application design, start by identifying all the data sources in the application. I like to do this by first creating a list. Next, I draw blocks in a diagram and label the data sources. Finally, I map the data sources to their final destinations, marking how the data is converted, how it is processed, and which application areas consume it. By the time I’m done, the tasks, data storage, synchronization mechanisms, and so forth naturally come out of the data flow.
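The list-then-map step can also be captured as a small table in code, which makes it easy to review the data rates before any tasks exist. The sketch below is only an illustration: the `data_flow_t` type, the three example flows, and their sizes and rates are all hypothetical and are not taken from any particular product.

```c
#include <assert.h>
#include <stddef.h>

/* One row per data source: where the data comes from, how it is
 * processed, and where it ends up. All entries are hypothetical. */
typedef struct {
    const char *source;             /* where the data originates */
    const char *processing;         /* how it is converted or processed */
    const char *destination;        /* which part of the system consumes it */
    unsigned    bytes_per_sample;
    unsigned    samples_per_second;
} data_flow_t;

static const data_flow_t data_flows[] = {
    { "accelerometer",      "low-pass filter", "motion task", 6, 100 },
    { "UART console",       "command parser",  "application", 1,  50 },
    { "temperature sensor", "unit conversion", "flash log",   2,   1 },
};

#define DATA_FLOW_COUNT (sizeof data_flows / sizeof data_flows[0])

/* Sum the raw data rate across all flows, in bytes per second. */
unsigned long total_bytes_per_second(const data_flow_t *flows, size_t count)
{
    assert(flows != NULL);

    unsigned long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += (unsigned long)flows[i].bytes_per_sample
               * flows[i].samples_per_second;
    }
    return total;
}
```

Each row of such a table tends to suggest a candidate task, a buffer or queue between source and destination, and the rate at which that task has to run.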

RTOS Design Best Practice #2 – Use RMS to Verify your Design

Rate Monotonic Scheduling (RMS) provides an analysis technique that designers can use to test their assumptions about whether the tasks in their system can be scheduled successfully. I often see teams just create tasks and assume that their system is going to work; however, there is no guarantee of that, even with an RTOS involved. It may not be possible to schedule all the tasks successfully and have them all meet their deadlines.

There are several models that exist for RMS. The most basic model assumes:

  • Tasks are periodic
  • Tasks are independent
  • Preemptive scheduling is used
  • Each task has a constant worst-case execution time
  • All tasks are equally critical
  • Aperiodic tasks are limited to start-up and failure recovery

At first glance, some of these assumptions seem very unrealistic for the real world, yet most designs can be broken down and verified using them. (More complex models relax these assumptions.) An example analysis can be seen below:

[Figure: example Rate Monotonic Scheduling analysis]
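For the basic model, the classic check is the Liu and Layland utilization bound: a set of n periodic tasks is schedulable under RMS if the total utilization U = C1/T1 + ... + Cn/Tn is no greater than n(2^(1/n) - 1). That bound is 1.0 for one task, about 0.828 for two, about 0.780 for three, and approaches ln 2 (about 0.693) as n grows. The sketch below is a minimal version of that check. The type and function names are my own, and the inequality is rearranged to the equivalent form (1 + U/n)^n <= 2 so that it needs no math library.

```c
#include <assert.h>
#include <stddef.h>

/* One periodic task in the basic RMS model (names are hypothetical). */
typedef struct {
    double wcet_ms;    /* worst-case execution time, C_i */
    double period_ms;  /* period, T_i */
} rms_task_t;

/* Total CPU utilization U = sum of C_i / T_i. */
double rms_utilization(const rms_task_t *tasks, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(tasks[i].period_ms > 0.0);
        u += tasks[i].wcet_ms / tasks[i].period_ms;
    }
    return u;
}

/* Returns 1 if the task set passes the utilization bound test
 * U <= n * (2^(1/n) - 1), evaluated as (1 + U/n)^n <= 2.
 * Returns 0 otherwise. */
int rms_passes_utilization_test(const rms_task_t *tasks, size_t n)
{
    assert(n > 0);

    double base  = 1.0 + rms_utilization(tasks, n) / (double)n;
    double power = 1.0;
    for (size_t i = 0; i < n; i++) {
        power *= base;
    }
    return power <= 2.0;
}
```

Keep in mind that this test is sufficient but not necessary. A task set that fails it may still be schedulable, but that has to be shown with a more detailed analysis such as response-time analysis.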

RTOS Design Best Practice #3 – Task Decomposition starts from the Outside-In

Decomposing an application into tasks can be challenging. Developers often find themselves asking questions like:

  • Do I have too many tasks?
  • Do I not have enough tasks?
  • Can all these tasks be scheduled?
  • Should these tasks be combined or left separate?

When starting to decompose your applications, the best way to start is from the outside in. Start by looking at the hardware devices and the inputs and outputs of your system. Look at the data and the rate the data is produced. The inputs/outputs and hardware with their data flow will help to identify the major tasks in the system. For example, you might end up with a simple diagram like the following:

[Figure: task decomposition diagram]

The above diagram identifies five main tasks and then an application block that can be further decomposed.

RTOS Design Best Practice #4 – Decouple your RTOS using an OSAL

Unfortunately, I see too many companies that build their entire application around their RTOS. The RTOS is supposed to be a component in the application, NOT the foundation of the application. The problem is that developers don’t have control over the RTOS. If the RTOS changes, the application is not immune to those changes. If the RTOS is suddenly discontinued or acquired by another company, changing to a different RTOS can be as painful as rewriting most of the application.

The best practice here is to use an Operating System Abstraction Layer (OSAL). It can fit into the software architecture quite elegantly as shown in the diagram below:

[Figure: the OSAL in the software architecture]

Notice that the RTOS is just another component in the middle of the application stack. If we wanted to swap in a new RTOS, all we would need to do is change the OSAL mappings to the new RTOS. The application will have no clue that something changed!

The OSAL essentially acts as a dependency barrier and will use generic OS calls. For example, every RTOS has a semaphore, mutex, and so forth. The OSAL provides a generic API for the common RTOS features. If there are OS-specific features needed that aren’t in a common OSAL like CMSIS-RTOS2, then developers should write their own extension to the OSAL. This will continue to limit the coupling and dependency of the application on the RTOS. After all, you never really do know when it will change.
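To make the idea concrete, here is a minimal sketch of what a hand-rolled OSAL for mutexes might look like. Every name in it (`osal_port_t`, `osal_init`, `osal_mutex_lock`, and so on) is hypothetical and is not the CMSIS-RTOS2 API. The host stub at the bottom stands in for a real RTOS port so the example can be built and unit tested on a PC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical OSAL interface: the application only uses these names. */
typedef struct osal_mutex osal_mutex_t;   /* opaque to the application */

/* One table of mappings per RTOS port. */
typedef struct {
    osal_mutex_t *(*mutex_create)(void);
    int (*mutex_lock)(osal_mutex_t *m, unsigned timeout_ms);
    int (*mutex_unlock)(osal_mutex_t *m);
} osal_port_t;

static const osal_port_t *active_port;

void osal_init(const osal_port_t *port)
{
    assert(port != NULL);
    active_port = port;
}

osal_mutex_t *osal_mutex_create(void)
{
    return active_port->mutex_create();
}

int osal_mutex_lock(osal_mutex_t *m, unsigned timeout_ms)
{
    return active_port->mutex_lock(m, timeout_ms);
}

int osal_mutex_unlock(osal_mutex_t *m)
{
    return active_port->mutex_unlock(m);
}

/* Host stub port for unit tests. A real port would forward these
 * calls to the RTOS instead. Returns 0 on success, -1 on failure. */
struct osal_mutex { int locked; };

static osal_mutex_t *stub_mutex_create(void)
{
    static struct osal_mutex pool[4];   /* small static pool, no heap */
    static size_t used;
    return (used < 4) ? &pool[used++] : NULL;
}

static int stub_mutex_lock(osal_mutex_t *m, unsigned timeout_ms)
{
    (void)timeout_ms;                   /* the stub never blocks */
    if (m->locked) {
        return -1;
    }
    m->locked = 1;
    return 0;
}

static int stub_mutex_unlock(osal_mutex_t *m)
{
    if (!m->locked) {
        return -1;
    }
    m->locked = 0;
    return 0;
}

const osal_port_t osal_stub_port = {
    stub_mutex_create, stub_mutex_lock, stub_mutex_unlock
};
```

The application calls `osal_init(&osal_stub_port)` once and then only uses the `osal_*` functions. Moving to another RTOS means writing a new `osal_port_t` table for it, with no changes to application code. A function-pointer table is just one way to do the mapping; many OSALs map the calls at compile time with macros or thin wrapper functions to avoid the indirection.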

RTOS Design Best Practice #5 – Don’t use Semaphores as Mutexes

My last design best practice for you today is a simple one, but one that I see violated time and time again. Mutexes and semaphores are designed for different purposes. A mutex is designed for providing mutually exclusive access to a resource. A semaphore is designed for task notification and coordination.

I often see designers and developers use a binary semaphore as a mutex. A mutex can lock/unlock a resource. A binary semaphore can be given or taken, resulting in a state that looks a lot like lock/unlock. However, there is an important difference between the two: a mutex has an owner, and most RTOS mutexes use that ownership to implement priority inheritance. When a high-priority task blocks on a mutex held by a lower-priority task, priority inheritance temporarily raises the holder’s priority to that of the waiting task so that it can finish and release the mutex sooner, which minimizes the impact of the priority inversion.

A semaphore has no owner and therefore does not support priority inheritance. When used as a mutex, it can result in priority inversions and other design issues. Make sure you understand these differences and never use a semaphore to protect data access. Use the right tool, which is a mutex.
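The difference is easier to see in a toy model. The sketch below is not an RTOS and does not use any real RTOS API. It only models the bookkeeping: the mutex records its owner and boosts that owner when a higher-priority task has to wait, while the binary semaphore has no owner to boost. All names are hypothetical, and a larger number is assumed to mean a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Toy task model. Assumption: larger number = higher priority. */
typedef struct {
    const char *name;
    int base_priority;     /* priority assigned at design time */
    int active_priority;   /* priority the scheduler currently uses */
} toy_task_t;

/* A mutex knows which task owns it. */
typedef struct {
    toy_task_t *owner;     /* NULL when the mutex is free */
} toy_mutex_t;

/* Returns 1 if the caller acquired the mutex, 0 if it must block. */
int toy_mutex_lock(toy_mutex_t *m, toy_task_t *caller)
{
    if (m->owner == NULL) {
        m->owner = caller;
        return 1;
    }
    /* Priority inheritance: boost the owner to the waiter's priority. */
    if (caller->active_priority > m->owner->active_priority) {
        m->owner->active_priority = caller->active_priority;
    }
    return 0;
}

void toy_mutex_unlock(toy_mutex_t *m, toy_task_t *caller)
{
    assert(m->owner == caller);   /* only the owner may unlock */
    /* Simplification: assumes the task holds no other mutexes. */
    caller->active_priority = caller->base_priority;
    m->owner = NULL;
}

/* A binary semaphore only tracks a flag. It has no owner. */
typedef struct {
    int available;
} toy_bsem_t;

/* Returns 1 if taken, 0 if the caller must block. Nothing is boosted,
 * because the semaphore does not know which task took it. */
int toy_bsem_take(toy_bsem_t *s)
{
    if (s->available) {
        s->available = 0;
        return 1;
    }
    return 0;
}

void toy_bsem_give(toy_bsem_t *s)
{
    s->available = 1;   /* any task or ISR may give */
}
```

If a priority 1 task holds the toy mutex and a priority 3 task tries to lock it, the holder runs at priority 3 until it unlocks. With the toy semaphore in the same scenario the holder stays at priority 1, so any priority 2 task can preempt it and delay the priority 3 task indefinitely.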

Conclusions

RTOS design has become a common activity that embedded developers need to perform. Best practices are not widely advertised, and teams often have to learn them the hard way. The biggest struggles I see can be alleviated by the 5 RTOS Design Best Practices that we just examined. Follow them carefully and you can avoid a lot of heartache.

 

4 thoughts on “5 RTOS Design Best Practices”

  1. Hello, Jacob

    In Practice #2, list:
    “Aperiodic tasks are limited to start-up and failure recovery”
    What exactly does this mean?

    1. An aperiodic task is one that does not recur at a regular interval, in contrast to a periodic task that runs every 5 ms, 10 ms, 100 ms, and so forth. The basic analysis assumes that aperiodic tasks only occur at start-up or during failure recovery. For aperiodic tasks that occur at other times, I will usually assign a worst-case periodic rate in my own analysis.
