Implementing GCC Inline Assembly Within C Expressions
The asm keyword in GCC lets developers embed assembly instructions directly in C source files. The compiler does not parse or analyze the instructions themselves: after substituting operand placeholders, it copies the template string into its assembly output and leaves validation to the assembler.
The toolchain supports two syntactic forms of extended asm. The second adds a label list so that the assembly can branch to C labels, and it requires the goto qualifier. Both forms consist of an assembler template followed by colon-separated operand sections, any of which may be left empty.
asm [qualifiers] (
"Assembler_Template_String"
: Output_Operand_List
[ : Input_Operand_List
[ : Clobber_Expressions ] ])
// Alternative form supporting branches
asm [qualifiers] (
"Assembler_Template_String"
:
: Input_Operand_List
: Clobber_Expressions
: Label_List);
Syntax Qualifiers
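- volatile: Prevents the compiler from deleting the asm statement when its outputs are unused and from hoisting it out of loops. It does not by itself stop the compiler from moving the statement relative to other code.
- inline: For inlining decisions, tells the compiler to treat the asm statement as having the smallest possible size, however long the template is. It does not affect where the instructions are placed.
- goto: Tells the compiler that the assembly may jump to one of the C labels listed in the label section.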
Core Parameters
The operation relies on several distinct components:
- Assembler_Template
A string literal containing the assembly instructions. The compiler does not validate the instructions; it only substitutes operand placeholders before passing the text to the assembler. Separate multiple instructions within the string with \n\t. Because % introduces an operand placeholder, a literal percent sign (for example, in an AT&T register name such as %rax) must be written as %%.
- Output_Operands
A comma-separated list mapping the C variables modified by the assembly block to their assembly placeholders. The list may be empty if the instructions produce no output.
- Input_Operands
A comma-separated list of C expressions whose values the assembly block only reads. This list may also be empty.
- Clobbers
A list of resources changed by the assembly code that do not appear in the input or output lists (for example, scratch registers or the flags register). Register names and the special keywords "cc" and "memory" are accepted here.
- Goto_Labels
Required when using the goto qualifier; lists the C labels that the assembly code might jump to.
Operand Specifications
Each operand definition generally follows the pattern [name] constraint (variable).
Named vs. Numbered Identifiers
To improve code maintainability, symbolic names can be assigned to operands (e.g., %[my_var]). If omitted, the compiler assigns numeric indices starting from 0 for outputs and continuing sequentially for inputs.
int64_t dest_val;
int64_t src_val = 10;
// Using named placeholders for clarity
asm volatile("movq %[src], %[dest]"
: [dest] "=r"(dest_val)
: [src] "m"(src_val));
// Equivalent version using auto-generated indices
asm volatile("movq %1, %0"
: "=r"(dest_val)
: "m"(src_val));
Constraint types dictate how the compiler allocates storage. Common constraints include r (general-purpose registers) and m (memory locations).
Shared Registers
When an input must occupy the same location as an output, give the input a matching constraint: the index of the output operand written as a digit, such as "0". The compiler then allocates both operands to the same register, which suits two-operand instructions that read and then overwrite their destination.
int64_t accumulator = 3;
int64_t delta = 1;
// Input "0" loads accumulator into the register chosen for output %0;
// delta is a separate input (%2), so the result is accumulator + delta.
asm volatile("addq %2, %0"
: "=r"(accumulator)
: "0"(accumulator), "r"(delta)
: "cc");
Managing Side Effects
The compiler tracks only the changes declared in the output list; it cannot see any other registers, flags, or memory that the instructions modify. The clobber section declares these side effects.
Register Lists: Explicitly naming registers (e.g., %rax) tells the compiler to treat them as unavailable for other allocations during this block.
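Special Tokens:
- "cc": Indicates that the condition flags were modified.
- "memory": Tells the compiler that the assembly reads or writes memory other than its listed operands. The compiler must write back any values it holds in registers before the block and reload them afterward. This is a compiler-level barrier only; it does not flush CPU caches.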
Common Constraint Classes
| Identifier | Description |
|---|---|
| m | Any memory operand, using any addressing mode the machine supports. |
| r | A general-purpose register. |
| i | An immediate integer operand, including symbolic constants whose values are known only at assembly or link time. |
| n | An immediate integer operand whose numeric value is known at compile time. |
| g | Any general-purpose register, memory operand, or immediate integer. |
| p | An operand that is a valid memory address. |
| = | Modifier marking a write-only output operand (the previous value is discarded). |
Practical System Integrations
In high-level system programming, such as operating system kernels, inline assembly is often wrapped in macros to handle architecture-specific nuances.
Memory Ordering Guarantees
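Ordering memory operations takes two mechanisms: a compiler barrier, which stops the compiler from reordering accesses, and a hardware fence instruction, which stops the CPU from doing so. On x86, the hardware side uses fence instructions such as lfence, or a locked instruction on older 32-bit processors.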
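// Compiler-only barrier: emits no instruction, but stops the compiler
// from moving memory accesses across it
#define cpu_mb() asm volatile("" ::: "memory")
// Conditional compilation based on word size
#ifdef CONFIG_X86_32
// The lock prefix needs a memory destination, so add 0 to the top of the stack
#define cpu_read_barrier() asm volatile("lock; addl $0, 0(%%esp)" ::: "memory", "cc")
#else
#define cpu_read_barrier() asm volatile("lfence" ::: "memory")
#endif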
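In each macro, the "memory" clobber keeps the compiler from moving memory accesses across the statement. In cpu_read_barrier, the emitted instruction additionally keeps the CPU from reordering loads across it.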
Retrieving the Current Context
Operating systems frequently access per-CPU metadata such as the currently running task. A common idiom on x86-64 reads memory relative to the gs segment base, which the kernel points at the current CPU's per-cpu area, at a fixed per-cpu offset.
// Extract task structure pointer from current execution context
static inline struct task_struct *fetch_current_context(void) {
    struct task_struct *ctx;
    // Reads from GS segment base plus global offset
    asm volatile(
        "movq %%gs:%P1, %0"
        : "=r"(ctx)
        : "p"(&global_task_ptr)
    );
    return ctx;
}
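This technique shows how to combine a hardware addressing mode with a C variable. The p constraint together with the P operand modifier makes the compiler print the symbol's address as a bare displacement, so the instruction loads the pointer stored at that offset from the gs base into the output register. No physical address is involved; the access is an ordinary segment-relative load.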
Further Documentation
For detailed specifications, refer to the official GCC manual documentation on Extended ASM.