Amazon FreeRTOS: Behind the Scenes

I recently wrote an article for embedded.com entitled “A Peek Inside Amazon FreeRTOS” where I started to analyze the high-level behavior that the Amazon FreeRTOS application exhibits while running on an STM32F475 IoT Discovery Node. In this post, I will continue the analysis by demonstrating how a developer can follow the communication flow and examine the code execution without reviewing a single line of code.

It’s often said that a picture is worth a thousand words, and even in embedded software, this is still the truth. I’ve been using Percepio Tracealyzer to understand the Amazon FreeRTOS run-time behavior and one important aspect that we should be interested in as developers is how data and communication flows through-out the application. Using the communication view, Amazon FreeRTOS exhibits the following behavior:

In this single picture, we can easily see how the data flows through the application and what we should be looking for in the source. For example, we can see that there is:

A single mutex that is shared between the MQTT and TmrSvc tasks
There are two queues in the application
There is a single message buffer used to communicate between the MQTT and Echoing tasks
The TmrSvc, Echoing and MQTTEcho task can send information to the logging task through a queue
A queue is used to send information to MQTT from Echoing

Since Amazon FreeRTOS doesn’t include any of this information in the documentation, a developer would normally have to recreate this themselves by reviewing the source code. If they didn’t, they may not have the entire picture on how the software behaves and the chances can increase dramatically that bugs will be introduced when the code base is modified.

Seeing the communication flow is one thing, but a developer may want to understand more about what a particular task is doing. In order to better understand the Echoing task, I clicked on the last instantiation and reviewed the events that were generated when it executed. You can see the event data below:

In the event data, a developer should be able to see the events that match the communication flow. Waiting for the message buffer, receiving the buffer, sending data to the MQTT queue and the logging queue and then waiting again on the message buffer. Closer scrutiny does reveal that there are malloc events that are occurring during run-time. If this application has hard real-time requirements, we may want to flag this since malloc is often non-deterministic and could result in other performance issues such as memory fragmentation. Notice also that we are only seeing memory being dynamically allocated and no memory is being released! So, either we have a memory leak in the application or this memory is freed elsewhere in the application.

Tracking down the free statements could require us to dig into the code and match each malloc and free statement. Instead, I noted that this task was executing around 2 minutes and 7 seconds and filtered the event log for instances of free. Clicking on the event, a developer can see that the memory that is allocated in Echoing is used to store information that is also used by the Logging task. Once the Logging task is done with the data, it frees the memory location. The trace information can be seen below:

Scrolling through the event log, I estimated that there must be several hundred calls to free throughout the approximate 3-minute execution time. Just out of curiosity I exported the log and discovered that free is called 57,769 times! That’s 321 calls to free per second! (Which raises concerns that the heap could become fragmented). For a production system, we may want to statically allocate a buffer or use a memory pool to limit allocating and freeing memory so much which is undoubtedly eating away at available CPU cycles.

Now there was one last item that I wanted to investigate further before closing out this post. Several weeks ago, I wrote an article for DesignNews entitled “The Unknown and Hidden Behaviors of Real-Time Embedded Systems” where I noticed that a deleted task in FreeRTOS 9.0.0 was never quite deleted. I found this to be curious behavior and wanted to see if Amazon FreeRTOS, which is based on FreeRTOS 10.0.0, exhibited the same behavior. I used the event viewer to find the only two instances of vTaskDelete in the application and reviewed them. The trace can be seen below:

After calling the vTaskDelete function, the Echoing and MQTTEcho task no longer appear in the trace data. This means that the phantom task execution that I saw in FreeRTOS 9.0.0, no longer appears in FreeRTOS 10.0.0, which is quite reassuring.

As we’ve seen through-out this article, using trace technology can be extremely useful to analyze a code base which has no documentation or that we want to better understand from an execution stand-point. We’ve also seen that Amazon FreeRTOS appears to behave in a predictable manner and also exhibits some behaviors that a developer working on a production intent system may want to investigate further.

References