Fading Coder

Linux Kernel Memory Reclamation Architecture

Allocation Triggers and Watermarks

When the page allocator tries to obtain a physical page, it first checks the amount of free memory against the low watermark. If that attempt fails, memory pressure is mild: the allocator wakes the kswapd daemon to reclaim pages asynchronously, then retries against the minimum watermark. If the allocation still fails, memory is critically short and the allocating task performs synchronous direct reclaim.
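The two-stage watermark check can be sketched in user space as follows. This is a simplified model, not kernel code: `struct zone_model`, `watermark_ok`, and `classify_allocation` are hypothetical names, and the comparison only approximates what the real allocator does.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a memory zone. */
struct zone_model {
    unsigned long free_pages;
    unsigned long wmark_min;
    unsigned long wmark_low;
};

enum alloc_path {
    ALLOC_FAST_PATH,      /* satisfied above the low watermark */
    ALLOC_AFTER_KSWAPD,   /* kswapd woken, satisfied above the min watermark */
    ALLOC_DIRECT_RECLAIM  /* below min: the caller must reclaim synchronously */
};

/* True if taking nr pages still leaves more than mark pages free. */
static bool watermark_ok(const struct zone_model *z, unsigned long nr,
                         unsigned long mark)
{
    return z->free_pages >= nr && z->free_pages - nr > mark;
}

static enum alloc_path classify_allocation(const struct zone_model *z,
                                           unsigned long nr)
{
    if (watermark_ok(z, nr, z->wmark_low))
        return ALLOC_FAST_PATH;
    /* Mild pressure: this is where kswapd would be woken. */
    if (watermark_ok(z, nr, z->wmark_min))
        return ALLOC_AFTER_KSWAPD;
    return ALLOC_DIRECT_RECLAIM;
}
```

With a minimum watermark of 100 and a low watermark of 200, a zone with 150 free pages falls into the middle case: the fast path is refused, but the retry against the minimum watermark succeeds.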

Page Classification and Recovery Strategies

Memory pages are categorized by their backing store, which determines the appropriate recovery method:

  • Swappable Pages: Includes anonymous pages (among them pages of private file mappings that were modified through copy-on-write) and tmpfs pages. The kernel writes their contents to the swap area before freeing the physical frame.
  • File-Backed Pages: These have a copy on persistent storage. Clean pages (unmodified since they were read from storage) can be discarded immediately. Dirty pages (modified only in memory) must be written back to the backing store before release.

Additionally, the slab allocator supports dynamic shrinking via registered shrinker callbacks invoked during the reclaim cycle.
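The page classification above reduces to a small decision function. The sketch below is illustrative only: `struct page_model`, `enum reclaim_action`, and `pick_reclaim_action` are made-up names, and it ignores refinements such as pages that already have an up-to-date copy in swap.

```c
#include <stdbool.h>

enum reclaim_action {
    RECLAIM_SWAP_OUT,   /* write contents to swap, then free the frame */
    RECLAIM_DISCARD,    /* clean file page: drop it, re-read later if needed */
    RECLAIM_WRITEBACK   /* dirty file page: write to the backing store first */
};

/* Hypothetical model of the two properties that matter here. */
struct page_model {
    bool swap_backed;   /* anonymous or tmpfs content */
    bool dirty;         /* modified in memory since the last writeback */
};

static enum reclaim_action pick_reclaim_action(const struct page_model *p)
{
    if (p->swap_backed)
        return RECLAIM_SWAP_OUT;
    return p->dirty ? RECLAIM_WRITEBACK : RECLAIM_DISCARD;
}
```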

LRU Selection Algorithm

The kernel uses a Least Recently Used (LRU) policy to identify candidates for eviction. Before a mapped page can be freed, every page table entry that maps it must be removed, so the kernel has to know which virtual addresses map a given physical frame. To enable this unmapping process, reverse mapping structures connect physical pages to their corresponding Virtual Memory Areas (VMAs).

LRU Vectors and List Management

Each memory node (struct pglist_data) contains an LRU vector (lruvec). This structure holds five distinct lists:

  1. Inactive Anonymous: Anonymous pages that are accessed infrequently.
  2. Active Anonymous: Anonymous pages that are accessed frequently.
  3. Inactive File: File cache pages that are accessed infrequently.
  4. Active File: File cache pages that are accessed frequently.
  5. Unevictable: Pages locked via mlock that cannot be reclaimed.

State information is recorded within the flags of the page descriptor (struct page):

  • PG_lru: Indicates membership in an LRU list.
  • PG_swapbacked: Marks the page content as swap-backed rather than file-backed.
  • PG_active: Marks the page as active.
  • PG_unevictable: Marks the page as unevictable, so reclaim skips it.
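To show how these flags relate to the five lists, here is a small user-space sketch that picks a list from a page's flags. The enumerator names mirror the list names above, but `struct page_flags_model` and `pick_lru_list` are hypothetical, and treating every page without PG_swapbacked as a file page is a simplifying assumption.

```c
#include <stdbool.h>

/* The five lists, ordered so that "active" is always "inactive" + 1. */
enum lru_id {
    LRU_INACTIVE_ANON,
    LRU_ACTIVE_ANON,
    LRU_INACTIVE_FILE,
    LRU_ACTIVE_FILE,
    LRU_UNEVICTABLE
};

/* Hypothetical model: one bool per page flag discussed above. */
struct page_flags_model {
    bool swapbacked;
    bool active;
    bool unevictable;
};

static enum lru_id pick_lru_list(const struct page_flags_model *f)
{
    if (f->unevictable)
        return LRU_UNEVICTABLE;

    /* Swap-backed pages go on the anonymous lists, the rest on file lists. */
    int base = f->swapbacked ? LRU_INACTIVE_ANON : LRU_INACTIVE_FILE;

    return (enum lru_id)(f->active ? base + 1 : base);
}
```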

Pages are ordered by recency within these lists, with the head representing the most recently accessed entries. Eviction targets are taken from the tail of inactive lists. Active pages are demoted to inactive lists during scanning to age them out.
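The head-insert, tail-evict discipline described above can be modelled with a circular doubly linked list. This is a minimal sketch under assumed names (`struct lru_page`, `struct lru_model`, `demote_one`, `evict_one`); the kernel's own list handling is considerably more involved.

```c
#include <stddef.h>

struct lru_page {
    int id;
    struct lru_page *prev, *next;
};

/* Circular list with a sentinel: head.next is the newest page,
 * head.prev is the oldest (the tail). */
struct lru_model {
    struct lru_page head;
};

static void lru_init(struct lru_model *l)
{
    l->head.prev = l->head.next = &l->head;
}

static int lru_empty(const struct lru_model *l)
{
    return l->head.next == &l->head;
}

static void lru_add_head(struct lru_model *l, struct lru_page *p)
{
    p->next = l->head.next;
    p->prev = &l->head;
    l->head.next->prev = p;
    l->head.next = p;
}

/* Detach and return the oldest page, or NULL if the list is empty. */
static struct lru_page *lru_pop_tail(struct lru_model *l)
{
    struct lru_page *p = l->head.prev;

    if (p == &l->head)
        return NULL;
    p->prev->next = &l->head;
    l->head.prev = p->prev;
    p->prev = p->next = NULL;
    return p;
}

/* Aging: move the oldest active page to the head of the inactive list. */
static void demote_one(struct lru_model *active, struct lru_model *inactive)
{
    struct lru_page *p = lru_pop_tail(active);

    if (p)
        lru_add_head(inactive, p);
}

/* Eviction candidates always come from the inactive tail. */
static struct lru_page *evict_one(struct lru_model *inactive)
{
    return lru_pop_tail(inactive);
}
```

A demoted page lands at the head of the inactive list, so it gets a full pass through that list before it becomes an eviction candidate.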

Activity Tracking

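Page activity is detected in two ways. For pages mapped into a process address space, the hardware sets the accessed bit in the page table entry, and reclaim tests and clears that bit while scanning. For cached file pages that are read or written through system calls rather than through a mapping, the kernel sets the PG_referenced flag during the I/O operation.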
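Both mechanisms can be sketched in a few lines. The model below assumes a two-touch promotion policy (first access marks the page referenced, a second access promotes it to active), loosely following the kernel's mark_page_accessed(); all struct and function names here are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of a page table entry's accessed ("young") bit. */
struct pte_model {
    bool young;
};

/* Hypothetical model of the two page flags involved in activity tracking. */
struct page_state {
    bool referenced;
    bool active;
};

/* Mapped pages: report whether the hardware saw an access, then reset the
 * bit so that the next scan only observes new accesses. */
static bool test_and_clear_young(struct pte_model *pte)
{
    bool was_young = pte->young;

    pte->young = false;
    return was_young;
}

/* Unmapped file cache pages: called from the read/write path.
 * Assumed two-touch policy: referenced first, active on the second access. */
static void note_access(struct page_state *p)
{
    if (!p->referenced) {
        p->referenced = true;
    } else if (!p->active) {
        p->active = true;
        p->referenced = false;
    }
    /* Already active and referenced: nothing left to record. */
}
```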

Reverse Mapping Structures

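To unmap a page, the kernel must locate every page table entry that references it. Key members of struct page facilitate this:

struct page {
    // ... other fields ...
    union {
        /* File's address_space, or an anon_vma when the low bit is set */
        struct address_space *mapping;
    };

    /* Page-sized offset within the mapped object */
    pgoff_t index;

    /* Number of page table entries mapping this page, minus one */
    atomic_t _mapcount;
    // ... other fields ...
};

  • mapping: Points to the address_space of the backing file's inode, or to an anon_vma for anonymous pages. The low bit of the pointer is set to mark the anonymous case.
  • index: Offset of the page within the mapping, in page-sized units.
  • _mapcount: Stores the number of page table entries that map the page, minus one, so an unmapped page holds -1.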
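The tagged-pointer trick and the off-by-one mapcount are easy to get wrong, so here is a user-space sketch of both. The `_model` types, `MODEL_MAPPING_ANON`, and the helper names are hypothetical stand-ins; the sketch relies on the pointed-to objects being at least 2-byte aligned so the low bit is free.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAPPING_ANON ((uintptr_t)1)  /* hypothetical low-bit tag */

struct address_space_model { int placeholder; };
struct anon_vma_model { int placeholder; };

struct page_model {
    uintptr_t mapping;    /* tagged pointer: low bit set means anon_vma */
    unsigned long index;  /* offset within the mapping, in pages */
    int mapcount;         /* number of mappings minus one; -1 = unmapped */
};

static void set_file_mapping(struct page_model *p,
                             struct address_space_model *as)
{
    p->mapping = (uintptr_t)as;
}

static void set_anon_mapping(struct page_model *p, struct anon_vma_model *av)
{
    p->mapping = (uintptr_t)av | MODEL_MAPPING_ANON;
}

static bool page_is_anon(const struct page_model *p)
{
    return (p->mapping & MODEL_MAPPING_ANON) != 0;
}

/* Strip the tag to recover the anon_vma; NULL for file-backed pages. */
static struct anon_vma_model *page_anon_vma(const struct page_model *p)
{
    if (!page_is_anon(p))
        return NULL;
    return (struct anon_vma_model *)(p->mapping & ~MODEL_MAPPING_ANON);
}

/* Undo the "minus one" bias to get the real number of mappings. */
static int page_mapcount(const struct page_model *p)
{
    return p->mapcount + 1;
}
```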

Anonymous Mapping Chains

Anonymous pages require special handling because they lack a backing file inode. An anon_vma instance organizes the VMAs that may share the page. A chain of anon_vma_chain nodes links multiple vm_area_struct instances (e.g., parent and child after fork) to the same anon_vma, so the kernel can efficiently find every process that maps the page.
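The lookup this enables can be illustrated with a deliberately simplified model in which each anon_vma keeps a singly linked list of chain nodes. The real structures are more elaborate, so treat the `_model` types and the `anon_vma_link` and `collect_mappers` helpers as hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for vm_area_struct. */
struct vma_model {
    int pid;                    /* owning process, for illustration only */
    unsigned long start, end;   /* virtual address range */
};

/* One chain node per (anon_vma, VMA) link. */
struct anon_vma_chain_model {
    struct vma_model *vma;
    struct anon_vma_chain_model *next;
};

struct anon_vma_model {
    struct anon_vma_chain_model *chains;
};

/* Attach vma to av using caller-provided storage for the chain node. */
static void anon_vma_link(struct anon_vma_model *av,
                          struct anon_vma_chain_model *avc,
                          struct vma_model *vma)
{
    avc->vma = vma;
    avc->next = av->chains;
    av->chains = avc;
}

/* Reverse-map walk: record the pid of every VMA attached to av.
 * Returns the number of VMAs found; stores at most max pids. */
static size_t collect_mappers(const struct anon_vma_model *av,
                              int *pids, size_t max)
{
    size_t n = 0;

    for (const struct anon_vma_chain_model *c = av->chains; c; c = c->next) {
        if (n < max)
            pids[n] = c->vma->pid;
        n++;
    }
    return n;
}
```

After a fork, linking both the parent's and the child's VMA to the same anon_vma lets a single walk reach both processes.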

Reclamation Execution Flow

Allocations that cannot be satisfied from free memory route through the reclamation subsystem. Whether reclaim is initiated asynchronously by kswapd or synchronously by the allocating task under severe shortage, the actual memory release logic converges on the shrink_node function, which frees pages belonging to a specific node.
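The convergence of both paths on one routine can be pictured with the following toy model. Every name here is hypothetical, and the reclaim targets are assumptions made for illustration: the kswapd path aims for a target watermark passed in by the caller, while the direct path reclaims a requested number of pages.

```c
/* Hypothetical model of a node with some number of reclaimable pages. */
struct node_model {
    unsigned long reclaimable_pages;
};

/* Hypothetical model of the per-pass reclaim request and result. */
struct scan_control_model {
    unsigned long nr_to_reclaim;
    unsigned long nr_reclaimed;
};

/* Shared worker: both entry points below end up here. */
static void shrink_node_model(struct node_model *node,
                              struct scan_control_model *sc)
{
    unsigned long want = sc->nr_to_reclaim - sc->nr_reclaimed;
    unsigned long got = want < node->reclaimable_pages
                            ? want : node->reclaimable_pages;

    node->reclaimable_pages -= got;
    sc->nr_reclaimed += got;
}

/* Asynchronous path: reclaim until free memory reaches the target. */
static unsigned long kswapd_reclaim_model(struct node_model *node,
                                          unsigned long free_pages,
                                          unsigned long target_wmark)
{
    struct scan_control_model sc = { 0, 0 };

    if (free_pages < target_wmark)
        sc.nr_to_reclaim = target_wmark - free_pages;
    shrink_node_model(node, &sc);
    return sc.nr_reclaimed;
}

/* Synchronous path: the allocating task reclaims what it needs. */
static unsigned long direct_reclaim_model(struct node_model *node,
                                          unsigned long nr_pages)
{
    struct scan_control_model sc = { nr_pages, 0 };

    shrink_node_model(node, &sc);
    return sc.nr_reclaimed;
}
```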
