Embedded Systems C/C++ Fundamentals for Firmware Engineers
This guide distills essential C and C++ concepts critical for embedded firmware development—particularly on resource-constrained microcontrollers like STM32. It emphasizes deterministic behavior, memory awareness, and ABI-safe practices over generic language theory.
C vs. C++ in Embedded Context
- C++ introduces object-oriented abstractions (classes, inheritance, polymorphism), while C remains procedural and closer to hardware.
- In C++, new/delete invoke constructors and destructors; in C, malloc/free perform raw memory allocation without any initialization logic.
- C++ supports references, function overloading, and templates. These features carry no inherent run-time cost, but bare-metal projects often restrict them because of code-size growth from template instantiation or toolchain limitations. C relies on function pointers and macros for similar flexibility.
- When returning compound types: C++ may invoke copy/move constructors (subject to RVO/NRVO). C also permits returning structs by value, and many ABIs return larger structs through a hidden pointer supplied by the caller; to avoid the copy, developers typically return pointers or pass output buffers as parameters.
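As a rough illustration of the two C styles named in the last bullet, the sketch below returns a small struct by value and, alternatively, fills a caller-supplied output parameter. The `sample_t` type and both function names are invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sensor sample, used only for this illustration. */
typedef struct {
    int32_t  value;
    uint32_t timestamp_ms;
} sample_t;

/* Style 1: return by value. Legal in C; the ABI decides whether the
 * result travels in registers or through a hidden pointer. */
sample_t sample_make(int32_t value, uint32_t timestamp_ms)
{
    sample_t s = { value, timestamp_ms };
    return s;
}

/* Style 2: the caller provides storage; the return value reports success. */
bool sample_fill(sample_t *out, int32_t value, uint32_t timestamp_ms)
{
    if (out == NULL) {
        return false;
    }
    out->value = value;
    out->timestamp_ms = timestamp_ms;
    return true;
}
```

The output-parameter form also leaves the return value free for an error code, which is one reason it is common in driver APIs.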
Fixed-Width Integer Types & Portability
For embedded systems, avoid platform-dependent types like int or long. Prefer explicit-width types from <stdint.h>:
#include <stdint.h>
uint8_t sensor_id; // Always 8-bit unsigned
int32_t timestamp_ms; // Always 32-bit signed
uint64_t cycle_count; // Always 64-bit unsigned
These guarantee consistent size and signedness across compilers (e.g., ARM GCC vs. IAR) and architectures—critical for register mapping, protocol serialization, and DMA buffer alignment.
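For protocol serialization, fixed-width types pair naturally with explicit shifts, which keep the byte order independent of the CPU. The following is a minimal sketch; `put_u32_be` and `get_u32_be` are hypothetical helper names, and big-endian wire order is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Compile-time check (C11): the shifts below assume 8-bit bytes. */
static_assert(CHAR_BIT == 8, "8-bit bytes required");

/* Write a 32-bit value into a buffer, most significant byte first. */
void put_u32_be(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Read a 32-bit value stored most significant byte first. */
uint32_t get_u32_be(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```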
Memory Layout & Allocation Strategies
| Region | Allocation Timing | Typical Use | Embedded Considerations |
|---|---|---|---|
| Stack | Runtime (per-function) | Local variables, function call frames | Size is fixed at link time (e.g., 2–8 KiB). Overflow causes silent corruption. Avoid large stack arrays. |
| Heap | Runtime (dynamic) | Buffers whose size is unknown at compile time | Rarely used in safety-critical firmware. malloc may fragment memory and has non-deterministic execution time. Prefer static pools or custom allocators. |
| Data/BSS | Link time | Initialized/uninitialized globals and static variables | Resides in RAM. BSS section is zero-initialized by startup code before main(). |
| Flash (RODATA) | Link time | String literals, const lookup tables | Use const + __attribute__((section(".rodata"))) for critical constants to prevent accidental RAM copies. |
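The "static pools" recommendation in the Heap row could look roughly like the fixed-block allocator below. This is a sketch under stated assumptions: the block size, block count, and the `pool_alloc`/`pool_free` names are all illustrative, and the code is not interrupt-safe as written.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  32u  /* bytes per block (assumed for the example) */
#define POOL_BLOCK_COUNT 4u   /* number of blocks (assumed for the example) */

/* Storage lives in BSS; its size is known at link time. */
static _Alignas(max_align_t) uint8_t pool_storage[POOL_BLOCK_COUNT][POOL_BLOCK_SIZE];
static uint8_t pool_in_use[POOL_BLOCK_COUNT];

/* Returns a free block, or NULL when the pool is exhausted.
 * The search is bounded by POOL_BLOCK_COUNT, so timing is predictable. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (!pool_in_use[i]) {
            pool_in_use[i] = 1u;
            return pool_storage[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 for an unknown pointer or a double free. */
int pool_free(void *block)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (block == (void *)pool_storage[i]) {
            if (!pool_in_use[i]) {
                return -1;
            }
            pool_in_use[i] = 0u;
            return 0;
        }
    }
    return -1;
}
```

Because every block has the same size, the pool cannot fragment; the trade-off is wasted space when requests are smaller than a block.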
Key Keywords in Practice
volatile
Prevents compiler optimizations that assume memory doesn’t change externally:
// Hardware register mapped to address 0x40000000
#define GPIOA_ODR (*(volatile uint32_t*)0x40000000)
// Forces write to hardware—even if value appears unused
GPIOA_ODR = 0x01; // Output pin 0 high
GPIOA_ODR = 0x00; // Output pin 0 low
Without volatile, the compiler might optimize away the first assignment, breaking timing-sensitive sequences.
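The same qualifier matters for variables shared between an interrupt handler and the main loop. The sketch below uses hypothetical names (`uart_rx_isr_sketch`, `uart_try_read`) and calls the "ISR" as an ordinary function so it can run anywhere. Note that volatile only forces the memory accesses to happen; it does not make multi-step updates atomic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between interrupt context and the main loop. */
static volatile bool    rx_ready;
static volatile uint8_t rx_byte;

/* On real hardware this body would sit in the UART interrupt handler. */
void uart_rx_isr_sketch(uint8_t received)
{
    rx_byte  = received;
    rx_ready = true;
}

/* Main-loop side: returns true and stores the byte if one has arrived.
 * Without volatile, the compiler could cache rx_ready in a register
 * and a polling loop around this call might never see the update. */
bool uart_try_read(uint8_t *out)
{
    if (!rx_ready) {
        return false;
    }
    *out     = rx_byte;
    rx_ready = false;
    return true;
}
```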
static
- At file scope: Limits symbol visibility to the current translation unit—prevents naming conflicts and enables aggressive optimization.
- At function scope: Allocates storage in BSS/data segment (not stack). Initialized once, persists across calls:
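void debounce_button(void) {
    static uint32_t last_press_ticks = 0;  // persists across calls
    static bool is_pressed = false;        // bool requires <stdbool.h> in C
    uint32_t now = get_tick_count();
    // Unsigned subtraction stays correct across tick-counter wraparound
    if (read_gpio_pin() && (now - last_press_ticks) > 20000) { // 20 ms debounce, assuming 1 µs ticks
        is_pressed = true;
        last_press_ticks = now;
    }
}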
const
Enforces immutability and guides placement:
- const uint8_t lookup_table[] = {0, 1, 4, 9, 16}; → Placed in flash (RODATA).
- static const int calibration_offset = 27; → May be optimized into immediate operands.
- Pointer qualifiers matter:
  - const uint8_t* ptr: data pointed to is read-only.
  - uint8_t* const ptr: pointer itself is fixed (e.g., hardware register alias).
  - const uint8_t* const ptr: both are immutable.
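A short sketch of how these qualifiers read in function signatures; `squares`, `sum_bytes`, and `fill_bytes` are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* const table: eligible for placement in flash on typical MCU toolchains. */
static const uint8_t squares[] = {0, 1, 4, 9, 16};

/* Pointer to const data: the function promises not to modify the bytes,
 * so callers can pass tables that live in read-only memory. */
uint32_t sum_bytes(const uint8_t *data, size_t len)
{
    uint32_t total = 0;
    for (size_t i = 0; i < len; i++) {
        total += data[i];
    }
    return total;
}

/* Const pointer to mutable data: dest cannot be re-seated inside the
 * function, but the buffer it points to can be written. */
void fill_bytes(uint8_t *const dest, size_t len, uint8_t value)
{
    for (size_t i = 0; i < len; i++) {
        dest[i] = value;
    }
}
```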
Structures, Unions & Memory Packing
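Default padding ensures alignment but wastes space. For peripheral registers or network packets, enforce a compact layout. Bit-field order and allocation are implementation-defined, so confirm the resulting layout on the target compiler: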
// Standard packed struct for CAN message header
#pragma pack(push, 1)
typedef struct {
    uint32_t id : 29;       // 29-bit identifier
    uint8_t rtr : 1;        // Remote transmission request
    uint8_t ide : 1;        // Identifier extension
    uint8_t dlc : 4;        // Data length code (0–8)
    uint8_t reserved : 4;
} can_header_t;
#pragma pack(pop)
Equivalently with GCC/Clang attributes:
typedef struct __attribute__((packed)) {
    uint32_t id : 29;
    uint8_t rtr : 1;
    uint8_t ide : 1;
    uint8_t dlc : 4;
    uint8_t reserved : 4;
} can_header_t;
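Since packing behavior depends on the compiler, it may help to pin the expected layout down with compile-time checks. The sketch below assumes GCC or Clang and uses a hypothetical `wire_record_t` without bit-fields, so that its size and offsets are fully determined by the packed attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire record: a 1-byte tag followed by a 32-bit value. */
typedef struct __attribute__((packed)) {
    uint8_t  tag;
    uint32_t value;
} wire_record_t;

/* The build fails if the compiler lays the struct out differently. */
static_assert(sizeof(wire_record_t) == 5, "wire_record_t must be 5 bytes");
static_assert(offsetof(wire_record_t, value) == 1, "value must follow tag");

/* Member access through the packed type lets the compiler emit code
 * that is safe for the unaligned 32-bit field. */
uint32_t wire_record_value(const wire_record_t *r)
{
    return r->value;
}
```

Taking the address of a packed member and dereferencing it through a plain `uint32_t *` loses that guarantee, so access fields through the struct itself.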
In C, unions enable type punning for endianness detection or register field access (in C++, reading a union member other than the one last written is undefined behavior):
union endian_test {
    uint16_t word;
    uint8_t bytes[2];
};
bool is_little_endian(void) {
    union endian_test u = {.word = 0x0001};
    return u.bytes[0] == 0x01; // LSB first
}
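Where the same source has to compile as both C and C++, a memcpy-based probe is one alternative to the union. The sketch below also shows a decode helper that sidesteps the question by assembling the value from bytes; both function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Inspect the first byte of a 16-bit value via memcpy, which is
 * well-defined in both C and C++. */
bool is_little_endian_memcpy(void)
{
    const uint16_t word = 0x0001u;
    uint8_t first_byte;
    memcpy(&first_byte, &word, 1);
    return first_byte == 0x01u;
}

/* Decode a little-endian 16-bit field without depending on CPU endianness. */
uint16_t load_le16(const uint8_t bytes[2])
{
    return (uint16_t)((uint16_t)bytes[0] | ((uint16_t)bytes[1] << 8));
}
```

For wire formats, the byte-assembly approach is usually the simpler choice because it needs no run-time detection at all.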
Pointer Arithmetic & Type Safety
Pointer arithmetic scales by pointed-to type size. Casts change interpretation:
uint32_t buffer[1024];
uint32_t* p32 = buffer;
uint8_t* p8 = (uint8_t*)buffer;
p32 += 2; // Advances 2 × sizeof(uint32_t) = 8 bytes → points to buffer[2]
p8 += 2; // Advances 2 × sizeof(uint8_t) = 2 bytes → points to byte offset 2
Use uintptr_t for integer-based address math to avoid undefined behavior on pointer overflow.
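A common use of integer-based address math is alignment. The helpers below are a minimal sketch; the names are illustrative, and both assume the alignment is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of alignment (power of two).
 * Unsigned arithmetic wraps in a defined way, unlike pointer arithmetic
 * that steps outside its object. */
uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    return (addr + (alignment - 1u)) & ~(alignment - 1u);
}

/* Check whether a pointer satisfies a power-of-two alignment. */
bool is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p & (alignment - 1u)) == 0u;
}
```

For example, `align_up(13, 8)` yields 16, while an already aligned address such as 16 is returned unchanged.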
Safe String Handling in Constrained Environments
Avoid unsafe functions (strcpy, strcat, gets). Prefer bounded alternatives:
- strncpy(dest, src, sizeof(dest)-1); dest[sizeof(dest)-1] = '\0';
- snprintf(buf, sizeof(buf), "Temp: %d.%d°C", deg, dec);
- For known-size buffers: memcpy(dest, src, n); (no null-termination needed)
Never use scanf in embedded firmware: an unbounded %s conversion can overflow the destination buffer. Parse manually or use sscanf with width specifiers.
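The bounded-copy and width-specifier advice above might be wrapped as follows. This is a sketch: `copy_bounded`, `parse_set_command`, and the "SET name value" command format are assumptions made for the example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dest, always null-terminating.
 * Returns 0 if the whole string fit, -1 if it was truncated or the
 * arguments were invalid. */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL || dest_size == 0u) {
        return -1;
    }
    int needed = snprintf(dest, dest_size, "%s", src);
    return (needed >= 0 && (size_t)needed < dest_size) ? 0 : -1;
}

/* Parse a hypothetical "SET <name> <value>" command.
 * %7s stores at most 7 characters plus the terminator, matching name[8]. */
int parse_set_command(const char *line, char name[8], int *value)
{
    return (sscanf(line, "SET %7s %d", name, value) == 2) ? 0 : -1;
}
```

Passing the destination size explicitly avoids the classic pitfall where `sizeof(dest)` silently becomes the size of a pointer once the array has decayed.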
Function Attributes for Critical Code
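GNU extensions (GCC/Clang) provide fine-grained control. Constructor functions run before main() only if the startup code walks .init_array, as newlib's __libc_init_array does: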
// Execute before main() — e.g., peripheral init
__attribute__((constructor)) void init_peripherals(void) {
    RCC->AHB1ENR |= RCC_AHB1ENR_GPIOAEN;
    GPIOA->MODER |= GPIO_MODER_MODER0_0;
}
// Never inline — preserve stack trace for debugging
__attribute__((noinline)) void hard_fault_handler(void) {
    __BKPT(0); // Trigger debugger breakpoint
}
// Optimize for size, not speed
__attribute__((optimize("Os"))) uint32_t crc32_update(uint32_t crc, uint8_t byte);
Memory Management Pitfalls
- Memory leaks: Rare on bare metal (no OS heap), but possible with dynamic allocators. Track allocations manually or use pool-based allocators.
- Use-after-free: Eliminated by design when firmware never calls free(), which is the norm in most MCU firmware.
- Wild pointers: Initialize all pointers to NULL; check before dereferencing:
uart_dev_t* uart = get_uart_instance(USART1);
if (uart != NULL) {
    uart_write(uart, "Ready\n", 6);
}
Bit Manipulation Idioms
Efficient bit counting (population count) that iterates once per set bit rather than once per bit position:
// Count set bits in a 32-bit word (Brian Kernighan's algorithm)
static inline uint8_t popcount32(uint32_t x) {
    uint8_t c = 0;
    while (x) {
        x &= x - 1; // Clear lowest set bit
        c++;
    }
    return c;
}
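When a fixed execution time matters more than the best case, a branch-free variant is an option. The sketch below uses the well-known parallel (SWAR) bit-summing technique; the function name is illustrative.

```c
#include <stdint.h>

/* Branch-free population count: sums bits in parallel within the word,
 * so the running time does not depend on the input value. */
uint32_t popcount32_swar(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums  */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums  */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums  */
    return (x * 0x01010101u) >> 24;                   /* add 4 bytes */
}
```

On GCC and Clang, `__builtin_popcount` is another route and lets the compiler pick the best sequence for the target.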
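Atomic bit-band access (Cortex-M3/M4, where the device implements it): each bit in the bit-band regions of SRAM and peripheral space is aliased to its own 32-bit word, so a single word write sets or clears one bit without a read-modify-write sequence.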
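The alias address follows a fixed formula: alias base + (byte offset × 32) + (bit number × 4). The sketch below only computes addresses, so it runs on any host; the base constants are the Cortex-M3/M4 bit-band regions, and the `bitband_alias` helper is a hypothetical name. Whether a given device implements bit-banding should be confirmed in its reference manual.

```c
#include <stdint.h>

/* Cortex-M3/M4 bit-band regions and their alias regions. */
#define BITBAND_SRAM_BASE    0x20000000u
#define BITBAND_SRAM_ALIAS   0x22000000u
#define BITBAND_PERIPH_BASE  0x40000000u
#define BITBAND_PERIPH_ALIAS 0x42000000u

/* Compute the alias word address for one bit of a byte address that
 * lies inside the 1 MiB bit-band region starting at region_base. */
uint32_t bitband_alias(uint32_t region_base, uint32_t alias_base,
                       uint32_t byte_addr, uint32_t bit)
{
    return alias_base + ((byte_addr - region_base) * 32u) + (bit * 4u);
}
```

On a target with bit-banding, writing 1 or 0 through a `volatile uint32_t *` cast of the returned address sets or clears that single bit.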
Preprocessor Best Practices
- Use static inline functions instead of #define for type safety and debuggability.
- Guard headers: #ifndef MY_DRIVER_H, #define MY_DRIVER_H, #endif.
- Conditional compilation for hardware variants:
#if defined(STM32F407xx)
    #define ADC_MAX_CHANNELS 16
#elif defined(STM32L476xx)
    #define ADC_MAX_CHANNELS 10
#else
    #error "Unsupported MCU: define ADC_MAX_CHANNELS for this target"
#endif
Standard I/O vs. Raw System Calls
In embedded contexts, avoid buffered stdio (fopen, printf) unless using semihosting (debug only) or a retargeted output stub. Prefer the C library's low-level write call or HAL abstractions:
- write(STDOUT_FILENO, "OK\n", 3); (minimal overhead; on bare metal this resolves to the C library's _write stub rather than an OS system call)
- Implement custom _write() for printf redirection to UART.
- Use snprintf + UART transmit for formatted output; never call printf in ISRs.
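A retargeting stub and the snprintf-then-transmit pattern could be sketched as below. Everything here is an assumption for illustration: `uart_putc` stands in for a real driver and simply captures bytes in RAM so the sketch runs on a host, and `uart_retarget_write` has the general shape of newlib's `_write` stub, which is the name it would need on a newlib target to redirect printf.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical UART stand-in: captures output in RAM instead of
 * writing to a transmit register. */
static char   uart_capture[64];
static size_t uart_capture_len;

void uart_putc(char c)
{
    if (uart_capture_len < sizeof uart_capture) {
        uart_capture[uart_capture_len++] = c;
    }
}

/* Sends len bytes and reports how many were consumed. On a newlib
 * target, a function like this named _write would receive printf output. */
int uart_retarget_write(int fd, const char *buf, int len)
{
    (void)fd; /* a single output channel is assumed */
    for (int i = 0; i < len; i++) {
        uart_putc(buf[i]);
    }
    return len;
}

/* Format into a bounded local buffer, then transmit the result. */
int uart_print_temp(int deg, int tenths)
{
    char line[32];
    int n = snprintf(line, sizeof line, "Temp: %d.%d C\n", deg, tenths);
    if (n < 0) {
        return -1;
    }
    if ((size_t)n >= sizeof line) {
        n = (int)sizeof line - 1; /* output was truncated */
    }
    return uart_retarget_write(1, line, n);
}
```

Keeping the formatting buffer small and local bounds the stack cost of each call, which is easier to reason about than printf's internal buffering.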