It’s always exciting to start a new development project, but some of the most interesting and challenging work in embedded systems is modernizing a legacy code base. Many products on the market today have been in production for years, if not a decade or more. Their code bases, while functional and feature rich, were often written with techniques that aren’t well suited to modern software development. For example, I often come across very successful products that were written as a single-file, monolithic implementation because that was how it was done 10, 15 or 20 years ago.
Teams often struggle with legacy code bases for reasons such as:
- The sheer number of features and code size (hundreds of thousands of lines of code, if not millions)
- Tight coupling between application code and hardware
- Antiquated tools and development processes
- Inability to effectively deploy new developers to work on the code base
While it may be tempting to simply start over from scratch, the time and money that a complete rewrite requires can be unrealistic. In this post, we are going to examine several tips teams can follow to start modernizing their legacy code projects.
Tip #1 – Audit your Application Code
Before attempting to modernize any legacy code base, the team should audit it. An audit can be performed by a team member within the company, but I’ve often found that an external, third-party viewpoint can provide far more insight and overcome the internal biases and opinions that tend to develop within a team. An audit helps the team understand exactly where their code currently stands and what they have to work with.
The output from an audit should be several-fold. First, as we have discussed, the team should understand where the code base currently stands. This should include understanding the current architecture (or lack thereof), features, function purposes and complexity. Second, a list of major challenges in modernizing the code base should be developed. There may be critical code areas, features, etc. that will require careful consideration and effort to modernize. These should be called out ahead of time. Finally, the output should be detailed enough to provide recommendations for how to modernize the code base. These recommendations should be easy to turn into a software modernization plan that can be integrated with future feature development, so that the code is modernized in lockstep with product development efforts.
Tip #2 – Generate a Function List and Dependency Graph
I’ve found that there are at least two function-related resources that are extremely useful for modernizing a code base: function lists and dependency graphs.
A function list can be generated by an automated tool. It lists all the functions in the code base and organizes them by the module they reside in. A single-file application will have just one list, whereas an application broken up into 5 modules will have 5 lists. A function list is helpful for monolithic applications that need to be modularized because it can be used to figure out which functions go together and should be grouped into a separate module.
A function dependency graph can also be automatically generated, and it helps a team understand what functions are calling which functions and how they relate to each other. A dependency graph is great for understanding function coupling and, again, can be extremely helpful in identifying functions that are related to each other. It can also be used to determine where natural boundaries in the code exist so that the software can start to be layered. Finally, it can expose poor architectural design and help guide the rework of the code so that clear boundaries exist between areas of concern such as hardware drivers, application business rules and so on.
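To make the idea of "functions that are related to each other" concrete, here is a minimal sketch in C of how call-graph data could be turned into candidate module groupings. It assumes the caller/callee pairs have already been exported from whatever tool generated the graph; the `CallEdge` type, the `group_functions` name and the `MAX_FUNCTIONS` limit are all hypothetical and exist only for this illustration. Functions that are connected by any chain of calls end up with the same group id.

```c
#include <stddef.h>

#define MAX_FUNCTIONS 16u   /* illustrative limit, not a real tool constraint */

/* One caller -> callee edge, expressed as indices into a function table. */
typedef struct {
    size_t caller;
    size_t callee;
} CallEdge;

/* Follow parent links until the representative of a group is reached. */
static size_t find_root(const size_t parent[], size_t node)
{
    while (parent[node] != node) {
        node = parent[node];
    }
    return node;
}

/*
 * Give every function a group id so that functions connected by any chain
 * of calls share the same id. Returns the number of groups found, or 0 if
 * function_count exceeds MAX_FUNCTIONS.
 */
size_t group_functions(const CallEdge edges[], size_t edge_count,
                       size_t function_count, size_t group_of[])
{
    size_t parent[MAX_FUNCTIONS];
    size_t groups = 0u;

    if (function_count > MAX_FUNCTIONS) {
        return 0u;
    }

    /* Start with every function in its own group. */
    for (size_t i = 0u; i < function_count; i++) {
        parent[i] = i;
    }

    /* Merge the groups of each caller/callee pair. */
    for (size_t e = 0u; e < edge_count; e++) {
        if (edges[e].caller >= function_count ||
            edges[e].callee >= function_count) {
            continue;   /* ignore edges that point outside the table */
        }
        size_t a = find_root(parent, edges[e].caller);
        size_t b = find_root(parent, edges[e].callee);
        if (a != b) {
            parent[b] = a;
        }
    }

    /* Report the representative of each function and count the groups. */
    for (size_t i = 0u; i < function_count; i++) {
        group_of[i] = find_root(parent, i);
        if (group_of[i] == i) {
            groups++;
        }
    }
    return groups;
}
```

In a real code base, top-level functions such as main() call into nearly everything and would merge every group into one, so they would probably need to be left out of the edge list before grouping. The result is only a starting point for discussion, not a finished module layout.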
Tip #3 – Modularize the Application
There is a lot of code out in the wild today that is still written as a monolithic, single-module code base; that is, everything is in main.c. A first step to modernizing such a code base is to modularize the application. Breaking up the application can:
- Start to create separations of concerns
- Ease the code merging process, allowing multiple developers to work easily on the project
- Allow developers to take ownership over modules
- Speed up development
There are certainly many more advantages, but breaking up the application into cohesive modules that contain similar functionality will go a long way towards modernizing the code.
Modularization also doesn’t need to start out as a strict, encapsulated, modern implementation. I still know developers today who, when developing an application, will put everything they are working on in main.c. Once the code works, they break it out into separate modules. It’s not my preferred way to develop, but I’ve seen them have great success with that technique. The same concept can be applied to a large legacy project; the difference is that there isn’t a single feature to separate but a decade’s worth of features and functionality. Start by taking baby steps.
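As an illustration of one such baby step, here is a minimal sketch of what a piece of logic might look like once it has been pulled out of main.c into its own module. The moving average filter, the `MovingAverage` type and the file names in the comments are hypothetical; they stand in for whatever cohesive group of functions your function list or dependency graph points to. The key idea is that the module owns its state and exposes a small interface, rather than relying on globals scattered through main.c.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- would live in moving_average.h (hypothetical file name) ---- */

#define MOVING_AVERAGE_WINDOW 4u

typedef struct {
    int32_t samples[MOVING_AVERAGE_WINDOW];
    size_t  next;    /* index of the slot that will be overwritten next */
    size_t  count;   /* number of valid samples, saturates at the window */
    int32_t sum;     /* running sum of the valid samples */
} MovingAverage;

void    MovingAverage_Init(MovingAverage *self);
int32_t MovingAverage_Add(MovingAverage *self, int32_t sample);

/* ---- would live in moving_average.c (hypothetical file name) ---- */

/* Reset the filter to an empty state. */
void MovingAverage_Init(MovingAverage *self)
{
    for (size_t i = 0u; i < MOVING_AVERAGE_WINDOW; i++) {
        self->samples[i] = 0;
    }
    self->next  = 0u;
    self->count = 0u;
    self->sum   = 0;
}

/* Add a sample and return the average of the samples currently held. */
int32_t MovingAverage_Add(MovingAverage *self, int32_t sample)
{
    if (self->count == MOVING_AVERAGE_WINDOW) {
        /* Window is full: drop the oldest sample from the running sum. */
        self->sum -= self->samples[self->next];
    } else {
        self->count++;
    }

    self->samples[self->next] = sample;
    self->sum += sample;
    self->next = (self->next + 1u) % MOVING_AVERAGE_WINDOW;

    return self->sum / (int32_t)self->count;
}
```

What remains in main.c is just a `MovingAverage` instance plus calls to `MovingAverage_Init()` and `MovingAverage_Add()`. Because the module no longer touches anything else in the application, it can also be compiled and exercised on its own.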
Tip #4 – Let go of how things “are” or “were”
How things are done and how things were done helped get the product to where it is today, but they aren’t going to help get it to where it needs to go. Teams often get stuck in an “are” and “were” loop where the past constantly influences the developers’ thinking about where things are going. In order to successfully modernize a code base, it’s useful to understand how things are and were in the code, but developers need to let that go in order to develop the “will be”. I can’t tell you how many times I’ve seen teams get tripped up and unable to move forward because their mindset is stuck in their old architecture and implementation, and they allow it not just to influence but to restrict what they can do in future development.
Tip #5 – Modernize through Incremental Improvements
One method for modernizing a code base is to make improvements incrementally. I often see teams that try to make dramatic changes to their legacy code to modernize it. Unfortunately, dramatic changes often result in dramatic bugs and effort. Rather than trying to make huge structural changes all at once, or rewriting everything from scratch, a team can put together a product modernization plan that outlines improvements to be made over several software releases.
I often recommend that teams start with low-hanging fruit: small changes that get the ball rolling and produce a noticeable improvement immediately. These changes could be things like:
- Reducing function complexity
- Modularizing mission critical code
- Separating and isolating application and hardware dependent code
Small, incremental changes with each release, or each sprint, can dramatically improve the code without adversely affecting product feature development.
In today’s post, we have just scratched the surface of how you can start to modernize a legacy code base. It’s important to recognize that modernization won’t happen overnight if it is going to be done right. A successful modernization effort will be incremental, but the benefits can be nearly instantaneous if teams focus on the low-hanging, high-value areas that are identified in a code audit.
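To show what separating application code from hardware-dependent code can look like, here is a minimal sketch in C. It assumes a simple heartbeat LED feature; the `LedDriver` interface, the `Heartbeat` type and the fake driver are hypothetical names invented for this example and do not come from any particular vendor library. The application logic only knows about a small interface of function pointers, so the same code can run against real registers on the target or against a fake on a development machine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interface: all the application knows about the LED hardware. */
typedef struct {
    void (*write)(bool on);
} LedDriver;

/* Application logic: depends on the interface, never on registers. */
typedef struct {
    const LedDriver *led;
    uint32_t period_ticks;    /* ticks between LED toggles */
    uint32_t elapsed_ticks;   /* ticks since the last toggle */
    bool     is_on;           /* last state written to the LED */
} Heartbeat;

/* Bind the heartbeat to a driver; the hardware is not touched here. */
void Heartbeat_Init(Heartbeat *self, const LedDriver *led,
                    uint32_t period_ticks)
{
    self->led           = led;
    self->period_ticks  = period_ticks;
    self->elapsed_ticks = 0u;
    self->is_on         = false;
}

/* Call once per system tick; toggles the LED every period_ticks calls. */
void Heartbeat_Tick(Heartbeat *self)
{
    self->elapsed_ticks++;
    if (self->elapsed_ticks >= self->period_ticks) {
        self->elapsed_ticks = 0u;
        self->is_on = !self->is_on;
        self->led->write(self->is_on);
    }
}

/* Fake driver for use off-target: records writes instead of driving a pin. */
static bool     fake_led_state;
static uint32_t fake_led_writes;

static void FakeLed_Write(bool on)
{
    fake_led_state = on;
    fake_led_writes++;
}

static const LedDriver fake_led = { FakeLed_Write };
```

On the target, a second `LedDriver` instance would wrap the actual GPIO access, and it would be the only place in this feature that includes device headers. A change like this can be made one peripheral at a time, which fits well with a release-by-release modernization plan.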
“A function dependency graph can also be automatically generated, and it helps a team understand what functions are calling which functions and how they relate to each other.”
Do you have any recommendations for tools that can generate such dependency graphs based on the source files?
I could only find Doxygen + Graphviz as a viable option which sometimes doesn’t show the whole call hierarchy and only yields a non-editable graph inside the doxygen files.
Are there any alternative tools, preferably open source (freeware)?
Thanks in advance and a good start into 2020 🙂
I typically use Understand when I want to do something like this. There are open source tools like Doxygen and Graphviz, but they are usually not fully featured and, as you mentioned, they don’t allow you to easily navigate the whole hierarchy. The ones I’ve found that generate anything really useful are commercial products.
I’ve used SourceTrail for a year, https://www.sourcetrail.com/, and am very pleased with its capabilities. It is much more capable than Doxygen+Graphviz. It’s easy to search and zoom into various parts of the call chain.
The program was proprietary but is now released under an open source license.
SourceTrail is nice, thanks. NetBeans also has a decent feature like that, though maybe more limited. Right-click on the function name and it has a show call graph option.