Understanding C Language Structures and Memory Layout

Declaring and Initializing Custom Types

A structure aggregates heterogeneous data elements into a single logical entity. Each component within the aggregate is referred to as a field or member, and members can vary in type.

struct DataTypeName
{
    member_type_1 member_name_1;
    member_type_2 member_name_2;
    /* additional members */
};

Instantiation and Initialization

Variables can be declared either alongside the type definition or separately. Initialization follows standard C aggregate rules, supporting both positional and designated syntax.

#include <stdio.h>

struct Publication {
    char title[50];
    char publisher[30];
    double cost;
} manual_ref_1, manual_ref_2;

int main(void) {
    struct Publication text_a = { "Systems Design", "TechPress", 45.50 };
    
    struct Publication text_b = { 
        .cost = 39.99, 
        .publisher = "CodeHouse", 
        .title = "Algorithm Basics" 
    };

    printf("%s - %.2f\n", text_a.title, text_a.cost);
    printf("%s - %.2f\n", text_b.title, text_b.cost);
    return 0;
}

Anonymous Definitions

Omitting the tag creates an anonymous structure. Such definitions restrict instantiation to the declaration point itself, preventing reuse elsewhere in the source file unless a typedef alias is applied.

Self-Referencing Structures

Complex data models like linked lists require nodes that reference their own type. To achieve this, the structure must be named; an anonymous definition cannot reference itself during declaration.

struct ListNode {
    int payload;
    struct ListNode* next_node;
};

typedef struct ListNode NodeAlias;

Memory Alignment Principles

Compilers insert padding bytes between members to satisfy architectural alignment constraints.

Alignment Rules

  1. The initial member always starts at an offset of zero relative to the structure's base address.
  2. Subsequent members align to addresses that are multiples of their alignment requirement. This requirement is calculated as the smaller of the compiler's default alignment value (where one exists) and the member's natural alignment, which for scalar types usually equals its size.
  3. Visual Studio applies a default packing value (8 bytes on x86 targets), while gcc has no default packing limit and aligns each member to its natural alignment.
  4. The total structure size must be a multiple of the largest alignment requirement among all its members.
  5. Nested structures align based on their most restrictive internal member, and the outer structure's total size expands to satisfy the maximum alignment across the entire hierarchy.

Demonstrating offset and size calculations:

#include <stdio.h>
#include <stddef.h>

struct LayoutCompact {
    char status_flag;
    char mode;
    int identifier;
};

struct LayoutExpanded {
    char status_flag;
    int identifier;
    char mode;
};

int main(void) {
    printf("Compact offsets: %zu %zu %zu\n", 
           offsetof(struct LayoutCompact, status_flag), 
           offsetof(struct LayoutCompact, mode), 
           offsetof(struct LayoutCompact, identifier));
           
    printf("Expanded size: %zu\n", sizeof(struct LayoutExpanded));
    return 0;
}

Rationale for Alignment

  • Hardware Constraints: Many processor architectures enforce strict memory access rules. Attempting to fetch misaligned multi-byte data can trigger bus faults or hardware exceptions.
  • Execution Efficiency: CPUs fetch memory in fixed-width chunks (e.g., 32 or 64 bits). Aligned data can be read in a single fetch. Misaligned data often spans two chunks, requiring multiple memory transactions and bit-shifting operations. Padding trades storage space for fewer instruction cycles.

Adjusting Default Alignment

Compilers provide preprocessor directives to override packing behavior, typically using #pragma pack(n). This forces the maximum alignment boundary to n, reducing padding at the cost of potential performance degradation on specific hardware.

Parameter Passing Strategies

Structures can be passed to functions either by value or by reference (pointer).

  • Pass by Value: Copies the entire structure onto the stack. Suitable for tiny aggregates but causes severe stack pressure and performance penalties for large datasets.
  • Pass by Pointer: Transmits only the memory address (typically 8 bytes on 64-bit systems). This avoids copying overhead and allows direct modification of the original data.

#include <stdio.h>

struct DataBuffer {
    int samples[512];
    int count;
    char priority;
};

void render_pointer(const struct DataBuffer *ptr) {
    for (int k = 0; k < ptr->count; ++k) {
        printf("%d ", ptr->samples[k]);
    }
    putchar('\n');
}

int main(void) {
    struct DataBuffer input = { {10, 20, 30}, 3, 'A' };
    render_pointer(&input);
    return 0;
}

Implementing Bit-fields

Bit-fields allow precise control over memory consumption by packing multiple logical values into a single integer container.

Syntax and Definition

Field declarations resemble standard structures but append a colon and a bit-width to the member name. The standard guarantees support for int, signed int, unsigned int, and (since C99) _Bool as base types; any other type is an implementation-defined extension.

#include <stdio.h>

struct NetworkPacket {
    unsigned int version : 4;
    unsigned int type : 4;
    unsigned int length : 8;
    unsigned int flags : 16;
};

int main(void) {
    struct NetworkPacket pkt = {0};
    unsigned int temp_val = 0;
    printf("Packet size: %zu bytes\n", sizeof(struct NetworkPacket));

    /* scanf("%u", &pkt.type); is invalid: & cannot be applied to a bit-field */
    if (scanf("%u", &temp_val) == 1) {
        pkt.type = temp_val & 0xFu;  /* keep only the 4 bits the field can hold */
    }
    printf("Type: %u\n", (unsigned int)pkt.type);
    return 0;
}

Memory Allocation Behavior

The compiler packs sequential bit-fields into storage units (usually int). When a field exceeds the remaining bits in the current unit, allocation typically shifts to the next unit. Exact layout depends entirely on the implementation.

Cross-Platform Inconsistencies

The C standard leaves several aspects of bit-field behavior implementation-defined or unspecified:

  1. Signed Representation: Whether a bit-field declared as plain int is treated as signed or unsigned.
  2. Maximum Width: A field cannot be wider than its base type, and the width of int itself varies across platforms (e.g., 16 vs 32 bits). Whether a field can span across storage unit boundaries also varies.
  3. Bit Ordering: The sequence in which fields are assigned within a storage unit (LSB to MSB vs. MSB to LSB).
  4. Unit Transition Strategy: Whether the remaining bits are left unused as padding when a field doesn't fit the remaining space.

Operational Restrictions

  • Bit-field members lack independent memory addresses because they share storage units with adjacent fields.
  • The address-of operator (&) cannot be applied to them, preventing direct usage with I/O functions like scanf or fread. Data must be read into a temporary scalar variable and subsequently assigned to the field.
