How to Debug a Hard Fault on an Arm Cortex-M

By Jacob Beningo | May 26, 2022 | June 22, 2022

In my opinion, one of the worst, most annoying faults to debug on an Arm Cortex-M microcontroller is a hard fault. If you are lucky, the hard fault appears after you’ve made some glaringly obvious mistake, and you can quickly undo it. I recently worked with a colleague who encountered a hard fault, but it was several commits deep, and I had no clue what could cause the fault. In this post, we’ll walk through the process I used to identify the cause and correct the hard fault.

An Imprecise Error

When a hard fault occurs, embedded developers have no choice but to dive into the depths of the microcontroller and examine the fault registers. The first register to examine on a deep dive is the Configurable Fault Status Register (CFSR). The CFSR is composed of three fault registers:

The MemManage Fault Status Register (MMFSR)
The BusFault Status Register (BFSR)
The UsageFault Status Register (UFSR)

Together, these registers can help us start down the path to understanding why we have a fault.
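To make the layout concrete, here is a minimal sketch of splitting a raw CFSR value into its three parts. The type and function names (cfsr_fields_t, cfsr_split) are my own for illustration and are not part of CMSIS; the bit ranges follow the Armv7-M layout, so check them against the reference manual for your core.

```c
#include <stdint.h>

/* Layout of the 32-bit CFSR on Armv7-M:
 *   bits  7:0   MemManage Fault Status (MMFSR)
 *   bits 15:8   BusFault Status        (BFSR)
 *   bits 31:16  UsageFault Status      (UFSR)
 */
typedef struct {
    uint8_t  mmfsr;
    uint8_t  bfsr;
    uint16_t ufsr;
} cfsr_fields_t; /* hypothetical helper type, not a CMSIS type */

/* Split a raw CFSR value (for example, one read in the debugger) into its fields. */
static cfsr_fields_t cfsr_split(uint32_t cfsr)
{
    cfsr_fields_t fields;

    fields.mmfsr = (uint8_t)(cfsr & 0xFFu);
    fields.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    fields.ufsr  = (uint16_t)((cfsr >> 16) & 0xFFFFu);

    return fields;
}
```

On a target, the argument would typically be the value of SCB->CFSR captured inside the hard fault handler. A non-zero field tells you which of the three status registers deserves a closer look.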

Unfortunately, the values stored in these registers are not always conclusive or helpful, depending on the hard fault. For example, when I examined the value of the CFSR, I discovered it was set to 0x400. Figure 1 below details what the bits in the CFSR mean. A value of 0x400 means that only bit 10, IMPRECISERR, is set: an imprecise bus error!

Figure 1 – The high-level register definition for CFSR. See here for additional details.
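If you don't have the figure handy, the bus fault bits can be decoded in a few lines of C. The sketch below is only an illustration: the table and function names are made up for this example, and the bit positions are the Armv7-M BusFault bits as I understand them (LSPERR exists only on cores with an FPU), so verify them against your core's documentation.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} cfsr_bit_t; /* hypothetical helper type */

/* BusFault status bits as they appear inside the 32-bit CFSR. */
static const cfsr_bit_t bus_fault_bits[] = {
    { UINT32_C(1) << 8,  "IBUSERR"     }, /* instruction bus error */
    { UINT32_C(1) << 9,  "PRECISERR"   }, /* precise data bus error */
    { UINT32_C(1) << 10, "IMPRECISERR" }, /* imprecise data bus error */
    { UINT32_C(1) << 11, "UNSTKERR"    }, /* fault while unstacking on exception return */
    { UINT32_C(1) << 12, "STKERR"      }, /* fault while stacking on exception entry */
    { UINT32_C(1) << 13, "LSPERR"      }, /* lazy FP state preservation fault (FPU cores) */
    { UINT32_C(1) << 15, "BFARVALID"   }, /* BFAR holds the faulting address */
};

/* Print the name of every bus fault bit set in cfsr; return how many were set. */
static unsigned cfsr_print_bus_faults(uint32_t cfsr)
{
    unsigned count = 0;

    for (size_t i = 0; i < sizeof bus_fault_bits / sizeof bus_fault_bits[0]; i++) {
        if ((cfsr & bus_fault_bits[i].mask) != 0u) {
            printf("%s\n", bus_fault_bits[i].name);
            count++;
        }
    }
    return count;
}
```

Calling cfsr_print_bus_faults(0x400) reports a single flag, IMPRECISERR, which matches the reading above.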

An imprecise error is an asynchronous bus fault: the bus reports the failed access only after the processor has already moved on to later instructions, which typically happens with buffered writes. It shows up as a hard fault when the bus fault is escalated, for example, because the BusFault handler is disabled or cannot run at the current priority. The problem with an imprecise fault is that you can’t trust that the other fault registers contain any direct or valuable information about the cause of the fault! That’s right, at this point, you’re in for reverting code or guessing and randomly trying different Band-Aids to try and fix the problem.

From Imprecise to Precise Errors

Thankfully, when you encounter an imprecise error causing your hard fault, all is not lost. The imprecise error may be caused by the CPU’s write buffer, which lets the processor keep executing instructions while an earlier store is still completing on the bus. If the buffer is disabled, every store must complete before the processor moves on to the next instruction. The result will be that the imprecise error turns into a precise error, and all the other fault registers may help identify the fault.

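The steps to disable the buffer are straightforward. On cores that implement it, such as the Cortex-M3 and Cortex-M4, developers can disable the write buffer by setting the DISDEFWBUF bit in the Auxiliary Control Register (ACTLR). With the CMSIS headers, the code to do this looks something like the following: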

SCnSCB->ACTLR |= SCnSCB_ACTLR_DISDEFWBUF_Msk;
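If your project doesn't pull in the CMSIS headers, the same bit can be set through a raw register access. The following is a sketch under the assumption of a Cortex-M3 or Cortex-M4, where ACTLR sits at 0xE000E008 and DISDEFWBUF is bit 1; the macro and function names are my own. Taking the register as a pointer is a design choice that lets the helper be exercised on a host PC with a stand-in variable.

```c
#include <stdint.h>

/* Assumed values for Cortex-M3/M4; confirm against your core's documentation. */
#define ACTLR_ADDR        UINT32_C(0xE000E008) /* Auxiliary Control Register */
#define ACTLR_DISDEFWBUF  (UINT32_C(1) << 1)   /* disable default write buffer */

/* Set DISDEFWBUF in the register that actlr points to, leaving other bits alone. */
static void disable_write_buffer(volatile uint32_t *actlr)
{
    *actlr |= ACTLR_DISDEFWBUF;
}

/* On target, during a debug session:
 *     disable_write_buffer((volatile uint32_t *)ACTLR_ADDR);
 */
```

Disabling the write buffer slows down stores, so this is something to switch on while hunting a fault rather than leave in production code.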

In addition to disabling the write buffer, it’s also a good idea to make sure that the Usage, Bus, and Memory faults are enabled in the SHCSR register. These faults can be enabled using the following C code snippet:

// Enable Usage-/Bus-/Mem Faults
SCB->SHCSR |= SCB_SHCSR_USGFAULTENA_Msk
            | SCB_SHCSR_BUSFAULTENA_Msk
            | SCB_SHCSR_MEMFAULTENA_Msk;
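It can also be worth reading the register back to confirm that the enables actually stuck. Here is a small sketch of that idea; the function name is hypothetical, and the bit positions (16, 17, and 18) are the Armv7-M SHCSR enable bits as I understand them.

```c
#include <stdbool.h>
#include <stdint.h>

/* SHCSR enable bits on Armv7-M (the register lives at 0xE000ED24). */
#define SHCSR_MEMFAULTENA  (UINT32_C(1) << 16)
#define SHCSR_BUSFAULTENA  (UINT32_C(1) << 17)
#define SHCSR_USGFAULTENA  (UINT32_C(1) << 18)
#define SHCSR_FAULT_ENABLES \
    (SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA)

/* Enable the three configurable fault handlers, then read the register back.
 * Returns true only if all three enable bits read back as set. */
static bool enable_configurable_faults(volatile uint32_t *shcsr)
{
    *shcsr |= SHCSR_FAULT_ENABLES;
    return (*shcsr & SHCSR_FAULT_ENABLES) == SHCSR_FAULT_ENABLES;
}
```

On target, the pointer would be the address of SCB->SHCSR. A false return suggests the core does not implement one of these handlers.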

Compile, cross your fingers, and rerun the code. Hopefully, the imprecise error is now a precise error, which allows us to dig much further into the cause of the hard fault. In my case, the CFSR now reads 0x8200! Bit 9, PRECISERR, is set, so we now have a precise error!

Debugging a Precise Error

Now that we have a precise error, we can examine the other bits in the CFSR register. In this case, the only other bit set is the BFARVALID bit. The BFARVALID bit tells us that the bus address stored in the BFAR register is a valid address and may tell us something about what has caused our fault. Initially, just by the BFARVALID bit being set, we can deduce that we have a bus fault causing our hard fault.
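In a fault handler, the same check can be written down so that the BFAR value is only used when it is meaningful. This is a minimal sketch with names of my own choosing; it assumes the caller has already captured the CFSR and BFAR values, and that BFARVALID is bit 15 of the CFSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFSR_BFARVALID (UINT32_C(1) << 15) /* BFAR holds a valid fault address */

/* Copy the faulting address to *addr_out and return true if BFARVALID is set.
 * If the bit is clear, *addr_out is left untouched and false is returned. */
static bool bus_fault_address(uint32_t cfsr, uint32_t bfar, uint32_t *addr_out)
{
    if ((cfsr & CFSR_BFARVALID) == 0u) {
        return false;
    }
    *addr_out = bfar;
    return true;
}
```

With a CFSR of 0x8200 the function hands back whatever BFAR contained; with the earlier imprecise reading of 0x400 it refuses, which is exactly why the imprecise case was a dead end.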

The BFAR register, the bus fault address register, in this case, holds a value of 0x100000. Interesting! Why is the processor faulting when the bus tries to access the address 0x100000? A quick investigation into the microcontroller memory map reveals that the memory address 0x100000 doesn’t exist! Flash memory, in this case, runs from 0x0 up to, but not including, 0x100000, so the last valid flash address is 0xFFFFF. The processor should be throwing faults, but why is the code trying to access an address outside the memory space?
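The off-by-one nature of this is easy to miss, so here is a small sketch of the range check. It assumes the flash region described above (base 0x0, 0x100000 bytes long); the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flash region as described in this post: starts at 0x0 and is 0x100000 bytes long. */
#define FLASH_BASE_ADDR  UINT32_C(0x00000000)
#define FLASH_SIZE_BYTES UINT32_C(0x00100000)

/* True if addr lies inside [base, base + size). Written to avoid overflow
 * when base + size would wrap past 0xFFFFFFFF. */
static bool addr_in_region(uint32_t addr, uint32_t base, uint32_t size)
{
    return (addr >= base) && ((addr - base) < size);
}
```

The last byte of flash, 0xFFFFF, passes the check, while 0x100000 is the first address past the end and fails it.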

A Memory Map Bug

Well, it turned out that my colleague was in the process of adding additional sections to the linker script. He was looking to add a section in memory for system configuration but unfortunately forgot that he had to resize the other flash sections. The result was that the new flash section was outside physical flash. Because the linker script declared the section, the linker placed it there without complaint, and the code ended up accessing this nonexistent area of memory. The result was a hard fault caused by a precise bus error!
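A layout mistake like this can be caught before it ever reaches the target. The sketch below checks that every declared section fits inside physical flash. The section names and sizes in the usage note are invented for illustration (I am not reproducing my colleague's actual linker script), and the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    origin; /* start address of the section */
    uint32_t    length; /* size of the section in bytes */
} flash_section_t;      /* hypothetical mirror of a linker script MEMORY entry */

/* True if the section lies entirely inside [flash_base, flash_base + flash_size). */
static bool section_fits(const flash_section_t *section,
                         uint32_t flash_base, uint32_t flash_size)
{
    if (section->origin < flash_base || section->length > flash_size) {
        return false;
    }
    /* Overflow-safe form of: origin + length <= flash_base + flash_size */
    return (section->origin - flash_base) <= (flash_size - section->length);
}

/* True only if every section in the table fits inside physical flash. */
static bool layout_fits(const flash_section_t *sections, unsigned count,
                        uint32_t flash_base, uint32_t flash_size)
{
    for (unsigned i = 0; i < count; i++) {
        if (!section_fits(&sections[i], flash_base, flash_size)) {
            return false;
        }
    }
    return true;
}
```

For example, with 0x100000 bytes of flash at 0x0, an application section of 0x100000 bytes at 0x0 plus a 0x1000-byte configuration section at 0x100000 fails the check. Shrinking the application to 0xFF000 bytes and placing the configuration section at 0xFF000 passes it.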

Conclusions

Troubleshooting hard faults on a microcontroller can be difficult if you don’t use the right process. In this post, we saw that developers can use the CFSR to identify the cause of their hard fault. In a more complicated situation, developers might need to disable the CPU write buffer to change an imprecise error into a precise one. Once this is done, the time to identify the issue can be dramatically shorter. In total, this particular bus fault only took about 10 to 15 minutes from start to finish. However, it could easily have taken me days. I hope this helps you quickly solve any future hard faults you or your team may encounter.



12 Comments

Andrzej says:

June 22, 2022 at 2:20 pm

Thank you for the post! It opens eyes to a scenario (a linker script not matching the actual chip memory layout) which is very often hard to find…
On the other hand, I would expect some ARM lib like CMSIS to parse these registers and provide some user-friendly error handler (stdlib based printout or similar) helping the user to get some clue right away… 🙂
1. Jacob Beningo says:
  
  June 22, 2022 at 5:30 pm
  
  Thanks for the comment. Unfortunately, in my experience, the only handler is a breakpoint in the hard fault handler. It would be great if the default handlers copied register values into a structure that allowed easy debugging, but I haven’t seen anyone do that. Maybe I should create an example for a future post …
  
  1. Bogdan Baudis says:
    
    June 22, 2022 at 5:37 pm
    
    I should probably extract the code I wrote even for myself. One problem is that it is tied to the type of the Cortex and to the compiler/linker, as there is no easy generic way to deal with these … CMSIS would be the proper place to do it … at least covering GNU, IAR and Keil … and at least for Cortex-M3, 4 and 7 … to get it right requires a careful reading of at least two ARM manuals (one for the core type and the other for the interrupt controller)
    1. Jacob Beningo says:
      
      June 22, 2022 at 5:44 pm
      
      Agreed! There used to be a CMSIS working group that met at Embedded World. I have not been since COVID, but maybe I can pass that along to my Arm contacts and see if they implement it…
      
      1. Bogdan Baudis says:
        
        June 23, 2022 at 10:32 am
        
        This would be great. Even if they settled on just providing them as examples. Personally I see no reason why they would not be a part of CMSIS.
        The way I understand it , it is already established that the ISV and related parts are to be modified by the user if so required.
        I am pretty sure this problem is being solved again and again by many people in many ways but it would be good to have some standard to start from.
Bogdan Baudis says:

June 22, 2022 at 2:47 pm

The other annoying thing is that so far as I know (unless that has changed lately) neither CMSIS nor the development frameworks provide support or examples for the trap/fault handlers. At least 2 – 3 years ago that was the case when I had to write them from scratch.
The one feature we implemented proved very valuable: in the Release version, the trap/fault handlers would leave a small info record in a RAM area marked as not-initializable, so on the following reset it could be read and at least displayed on the serial port (as the bugs would happily disappear when run under the debugger!).
Also it is very handy to read and display the reset reason ASAP on the serial console!
1. Jacob Beningo says:
  
  June 22, 2022 at 5:34 pm
  
  Thanks for the comment. That has been my experience as well! Perhaps in a future post, I’ll give an example of how to do this. Thanks again!
  
2. Vladimir Marchenko says:
  
  June 5, 2023 at 3:02 pm
  
  Hi Bogdan, agree, but gladly, the situation is improved now with CMSIS-View:Fault component that operates very similarly to what you describe: https://arm-software.github.io/CMSIS-View/latest/fault.html
  
Boudewijn Dijkstra says:

June 24, 2022 at 9:41 am

Note that not all Cortex-M’s have the ACTLR, and when they have it, they might not have the DISDEFWBUF bit. It is IMPLEMENTATION DEFINED in ARMv7-M and absent in ARMv6-M. Cortex-M3 and M4 have the bit, but Cortex-M7 doesn’t.

1. prasad says:
  
  September 14, 2023 at 1:30 am
  
  in this Case how to set DISDEFWBUF bit.
  
  1. Jacob Beningo says:
    
    December 1, 2023 at 7:45 pm
    
    Thanks for the question. You should be able to do something like:
    
    #include "core_cm4.h"
    
    SCnSCB->ACTLR |= SCnSCB_ACTLR_DISDEFWBUF_Msk;
    
    That’s if you use CMSIS. You could also define your own and do something like:
    
    #define ACTLR (*(volatile uint32_t *)0xE000E008) // Address of the Auxiliary Control Register (ACTLR)
    #define ACTLR_DISDEFWBUF (1UL << 1) // Bit position of DISDEFWBUF
    ACTLR |= ACTLR_DISDEFWBUF; // Set the DISDEFWBUF bit
    
Yevgeni Tunik says:

June 29, 2022 at 2:30 am

One more real case where analyzing the debug registers didn’t help:
A hard fault occurred rarely in the DMA interrupt handler of the internal ADC while waiting for the STM32F427 internal flash to become ready. In the end, a consultant discovered the totally unexpected reason: “2.2.12 Data cache might be corrupted during Flash read-while-write operation”. See errata https://www.st.com/resource/en/errata_sheet/es0206-stm32f427437-and-stm32f429439-line-limitations-stmicroelectronics.pdf
