3 Tips for Data-Centric Software Design

An elegant solution to many embedded software system problems is to leverage data-centric software design. Data is at the heart of every embedded system. First, data is acquired by sampling sensors, communication interfaces, and input/output devices. Next, data is converted, filtered, and processed into new data assets in the system. Finally, that data is acted upon to create outputs. My first principle in software design is that data dictates design. Here are three tips for data-centric software design.

Tip #1 – Follow the data

Every software system is, at its core, centered on data. Designing an effective and efficient software architecture and implementation means following that data. Fortunately, there is only a limited number of things that data can do in a software system: it can be produced, processed, and stored.

First, data is produced. For example, in an embedded system, there might be an analog sensor that produces a voltage measured by the microcontroller’s internal analog-to-digital converter. A touch screen may raise an interrupt and provide an x/y coordinate pair. Data is often produced as an input or output of the system. Data-centric software design carefully identifies the inputs and outputs because that is where we will find produced data.

Next, data is processed. The analog-to-digital conversion just mentioned produces ADC counts, which are data in raw form. The sensor data might be processed into a floating-point value representing the sensor’s scientific units. More commonly, the sensor value is left in raw form but filtered. For example, a low-pass or high-pass filter may be applied to remove noise. Designers need to follow the data to understand how it is processed.
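To make the two processing paths above concrete, here is a minimal sketch in C. It assumes a 12-bit converter with a 3.3 V reference and a simple first-order low-pass filter using integer math. The constants, type names, and function names are all hypothetical, chosen only for illustration; your hardware and filter choice will differ.

```c
#include <stdint.h>

/* Assumed hardware parameters, for illustration only. */
#define ADC_MAX_COUNTS 4095u /* full scale of a 12-bit converter */
#define ADC_VREF_MV    3300u /* 3.3 V reference, in millivolts   */

/* Convert raw ADC counts into millivolts using integer math. */
static uint32_t adc_counts_to_mv(uint16_t counts)
{
    return ((uint32_t)counts * ADC_VREF_MV) / ADC_MAX_COUNTS;
}

/* First-order low-pass filter (exponential moving average).
 * The state is kept scaled by 2^shift so no fractional bits are lost.
 * Keep shift <= 16 so a 16-bit sample cannot overflow the state. */
typedef struct {
    uint32_t state; /* filtered value scaled by 2^shift          */
    uint8_t  shift; /* smoothing factor: alpha = 1 / (2^shift)   */
} lowpass_t;

static void lowpass_init(lowpass_t *f, uint8_t shift, uint16_t initial)
{
    f->shift = shift;
    f->state = (uint32_t)initial << shift;
}

/* Feed one raw sample in, get the filtered raw value back.
 * Equivalent to y += (x - y) / 2^shift. */
static uint16_t lowpass_update(lowpass_t *f, uint16_t sample)
{
    uint32_t y = f->state >> f->shift; /* current filtered value */
    f->state = f->state - y + sample;
    return (uint16_t)(f->state >> f->shift);
}
```

With `shift` set to 2, a filter initialized at 1000 counts that receives a sample of 2000 moves a quarter of the way toward it, to 1250. The data stays in raw counts throughout; `adc_counts_to_mv` is only needed at the point where a scientific unit is actually required.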

Finally, data is stored. Data might be stored in memory as a single value. Data may be stored in an array of values that will be processed together. Data might be saved to non-volatile memory for later use. Again, understanding how data is stored is critical to the design process.

Whether a sensor or a communication interface produces your data does not matter. It does not matter if you are storing it on an SD card or a memory chip. From a design perspective, we care about where data comes into and out of the system (produced), where it is processed, and where it is finally stored. To create an effective software design, you must follow the data. Document the data in each state and how it is transformed. If you do that, you’ll discover that the design naturally falls into place.

Tip #2 – Document how data changes

In every system, raw data is transferred and converted into valuable data that is then output. Following how data flows through a system can drive the software architecture well; however, documenting how data changes and evolves throughout the system can help a designer scope the magnitude of the software.

For example, if I have some analog sensor data converted to a number and displayed in telemetry data, I know there isn’t much processing power required. The data processing is simple. However, if the sensor data is retrieved, stored in a circular buffer, filtered, then converted to a scientific unit, I know I have a more extensive scope for that data asset. Documenting the data flow and processing events can help designers to constrain and understand their design, even without the official microcontroller or hardware ever being identified.
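The larger-scope case above (retrieve, store in a circular buffer, filter, convert) might look something like the following sketch. The buffer length, the choice of a moving average as the filter, and the 12-bit / 3.3 V conversion are assumptions made for illustration, and every name here is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_BUF_LEN 8u /* hypothetical window length */

/* Circular buffer of raw samples with a running sum, so the
 * moving average costs the same on every call. */
typedef struct {
    uint16_t samples[SAMPLE_BUF_LEN];
    size_t   head;  /* index of the next write (oldest when full) */
    size_t   count; /* number of valid samples stored             */
    uint32_t sum;   /* running sum of the valid samples           */
} sample_buf_t;

void sample_buf_init(sample_buf_t *b)
{
    b->head = 0u;
    b->count = 0u;
    b->sum = 0u;
}

/* Store one raw sample, overwriting the oldest once the buffer is full. */
void sample_buf_push(sample_buf_t *b, uint16_t raw)
{
    if (b->count == SAMPLE_BUF_LEN) {
        b->sum -= b->samples[b->head]; /* drop the sample being replaced */
    } else {
        b->count++;
    }
    b->samples[b->head] = raw;
    b->sum += raw;
    b->head = (b->head + 1u) % SAMPLE_BUF_LEN;
}

/* Filter step: mean of the stored samples, still in raw counts. */
uint16_t sample_buf_mean(const sample_buf_t *b)
{
    return (b->count == 0u) ? 0u : (uint16_t)(b->sum / b->count);
}

/* Conversion step: assumes a 12-bit ADC with a 3300 mV reference. */
uint32_t counts_to_millivolts(uint16_t counts)
{
    return ((uint32_t)counts * 3300u) / 4095u;
}
```

Even at this size, the sketch shows why the second data asset has a larger scope than the first: it needs RAM for the window, state that persists between samples, and two extra processing steps, all of which can be estimated before any hardware is chosen.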

At the start of a design, follow these simple steps:

  • Identify the data produced in the system
  • Ascertain how each data asset moves through the software
  • Document how the data changes throughout its lifetime
  • Record any storage mediums and how the data is accessed

I’ve found creating a simple data flow diagram can dictate the software architecture and ensure that it is scalable and not overly complicated.

Tip #3 – Act on the data at the start of an event or task

A typical task in an embedded system will process information in the following steps:

  • Retrieve new data
  • Filter/process data
  • Output result

No surprise here, right? The process seems logical. Unfortunately, it doesn’t necessarily fit with a real-time embedded system design. A primary real-time system goal is to be deterministic and minimize jitter. The above process can maximize the potential for jitter rather than minimize it. Let’s consider an example.

A motor control task reads in the latest analog current and voltage measurements and then runs them through a PID controller. The PID controller provides an output for the new motor state. Depending on the controller design, each pass through the PID controller may not take the same number of clock cycles. The execution time can vary widely once you include the possibility of interrupts firing in the middle of the PID calculations. The result is that the motor output is never updated at the same point in each cycle. The update time moves around a bit, or jitters, which could affect the motor’s response.

A more reliable way to act on the motor output while minimizing jitter is to change the steps to:

  • Output previous result
  • Retrieve new data
  • Filter/process data

In this case, the output is always sent to the motor at the start of the loop, before further processing or other non-deterministic activities occur. So, yes, we are acting on data that might be slightly older, but we can account for that and reduce our jitter dramatically.
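The reordered steps could be sketched as follows. This is not a full PID implementation: a simple proportional-integral calculation stands in for the controller, and the hardware is simulated with two static variables so the ordering can be seen in isolation. All names, gains, and the `sim_` functions are hypothetical.

```c
#include <stdint.h>

/* Hypothetical hardware hooks; a real project would map these to drivers. */
typedef struct {
    int32_t (*read_feedback)(void);
    void    (*write_output)(int32_t value);
} motor_io_t;

typedef struct {
    int32_t kp;          /* proportional gain (illustrative)            */
    int32_t ki;          /* integral gain (illustrative)                */
    int32_t integral;    /* accumulated error                           */
    int32_t setpoint;    /* target value                                */
    int32_t next_output; /* result computed during the previous pass    */
} motor_ctrl_t;

/* One pass of the task, called at a fixed period. */
void motor_task_step(motor_ctrl_t *c, const motor_io_t *io)
{
    /* 1. Output the previous result first, at a fixed offset
     *    from the start of the task. */
    io->write_output(c->next_output);

    /* 2. Retrieve new data. */
    int32_t measured = io->read_feedback();

    /* 3. Filter/process. Variable execution time here no longer
     *    shifts the moment the output is applied. */
    int32_t error = c->setpoint - measured;
    c->integral += error;
    c->next_output = (c->kp * error) + (c->ki * c->integral);
}

/* Simulated hardware, for illustration only. */
static int32_t sim_feedback = 0;
static int32_t sim_output = 0;

static int32_t sim_read_feedback(void) { return sim_feedback; }
static void sim_write_output(int32_t value) { sim_output = value; }
```

Note the trade-off the ordering makes explicit: the value applied on each pass was computed during the previous pass, so the output lags the measurement by one task period. That delay is constant, which is what allows it to be accounted for in the controller tuning.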

Conclusions

When you break down what we do to design and implement embedded software, it all comes down to the data. Embedded software is nothing more than collecting, processing, storing, and outputting data in a deterministic manner. Following a data-centric software design approach will result in an architecture that highlights what is most important: the data. You’ll discover that the design is efficient, effective, and not bloated with modern design patterns and fillers. If you want an excellent software design, just follow the data.

2 thoughts on “3 Tips for Data-Centric Software Design”

  1. Thanks for the insights. I’ve been thinking about this a lot lately, and I still have some open questions. It seems to me that we can differentiate between designs where the data are stored for later use (something like the data model proposed by “Patterns in the machine”) and designs where data are passed on the fly as event parameters. Unfortunately, I haven’t found any in-depth resource explaining how to build a convincing “stored-data” model (PIM itself doesn’t really dig into it). Personally, I have the impression that such a model is appropriate when using a superloop (where the later task retrieves the data produced by an earlier task) or with periodic tasks, whereas in the case of tasks waiting for data (which is maybe more common, at least in my experience) an “event-data” approach is preferable. Does that make sense to you, or do you know of any interesting resources on this topic?

    1. Thanks for the comments.

      I’ve generally seen the approaches you’re asking about embedded within the context of designing applications with an RTOS. Unfortunately, the focus is usually on the RTOS, not the data.

      I can’t think of any resources immediately off the top of my head. However, the model is useful. I often use a similar technique to share data between ISR/Task and Task/Task code.

      I’ll put this on my list and see if I can write something up on it in the near future.
