G1 Garbage Collector Internals and Practical GC Tuning Methodologies

Tech May 13 3

G1 Architecture Overview

The Garbage-First (G1) collector became the default in JDK 9, replacing Parallel GC in that role; the same release deprecated CMS. It partitions the heap into equally sized regions and targets pause times rather than maximizing throughput alone.

Operational Profile

  • Balances throughput with predictable pause times, defaulting to a 200 ms target
  • Handles very large heaps by treating memory as a collection of regions
  • Uses a mark-compact strategy across the heap as a whole while copying live objects between individual regions

Key Configuration Flags

-XX:+UseG1GC                  # Enable G1 (required on JDK 7/8)
-XX:G1HeapRegionSize=<size>   # Region granularity
-XX:MaxGCPauseMillis=<ms>     # Pause time objective

The G1 Collection Cycle

G1 operates through three repeating macro-phases.

Young-Only Collections When Eden regions fill, G1 triggers a stop-the-world young collection. Live objects from Eden and the current survivor regions are copied to fresh survivor regions. Objects whose age has reached the tenuring threshold are promoted to old-generation regions, while younger survivors stay in survivor regions for another round.

Young Collection with Concurrent Marking As old-generation occupancy crosses a threshold, G1 initiates a concurrent marking cycle. The young collection that crosses this boundary also performs the initial mark of the roots (STW), after which marking continues concurrently with the application. The threshold is governed by:

-XX:InitiatingHeapOccupancyPercent=<percent>  # Default 45
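
The threshold comparison can be sketched as plain arithmetic. This is a simplified illustration, assuming the static form of the check (old-generation occupancy as a percentage of total heap capacity); the class and method names are hypothetical, and the real collector evaluates the condition at specific points and, on JDK 9+, adjusts the threshold adaptively.

```java
class IhopCheck {
    /**
     * Returns true when old-generation occupancy has reached the given
     * percentage of total heap capacity (hypothetical helper, not a JVM API).
     */
    static boolean shouldStartMarking(long oldGenUsedBytes, long heapCapacityBytes, int ihopPercent) {
        // Multiply before comparing so integer division does not lose precision.
        return oldGenUsedBytes * 100 >= heapCapacityBytes * (long) ihopPercent;
    }

    public static void main(String[] args) {
        long mib = 1L << 20;
        long heap = 4096 * mib; // a 4 GiB heap, chosen only for illustration
        System.out.println(shouldStartMarking(1024 * mib, heap, 45)); // 25% occupied
        System.out.println(shouldStartMarking(2048 * mib, heap, 45)); // 50% occupied
    }
}
```

With the default of 45, a 4 GiB heap would cross the threshold once roughly 1.8 GiB of old-generation data is in use.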

Mixed Collections Once marking completes, G1 schedules mixed evacuation cycles that reclaim Eden, survivor, and selected old regions. Rather than collecting the entire old generation, G1 ranks regions by reclaimable garbage density and evacuates the highest-yield candidates first to respect the pause goal.
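
The ranking idea can be illustrated with a small greedy selection. This is a sketch under stated assumptions, not G1's actual policy: the `Region` type, its cost estimate, and the rule of stopping at the first region that does not fit the budget are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;

class MixedCollectionSketch {
    /** Hypothetical per-region summary produced by marking. */
    static final class Region {
        final int id;
        final long liveBytes;
        final long regionBytes;
        final double estimatedCopyMillis; // assumed cost to evacuate the live data

        Region(int id, long liveBytes, long regionBytes, double estimatedCopyMillis) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.regionBytes = regionBytes;
            this.estimatedCopyMillis = estimatedCopyMillis;
        }

        long reclaimableBytes() {
            return regionBytes - liveBytes;
        }
    }

    /** Picks old regions with the most reclaimable space until the pause budget is used up. */
    static List<Region> chooseOldRegions(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        // Highest yield first: regions are equally sized, so bytes and density rank the same.
        sorted.sort((a, b) -> Long.compare(b.reclaimableBytes(), a.reclaimableBytes()));

        List<Region> chosen = new ArrayList<>();
        double spent = 0.0;
        for (Region r : sorted) {
            if (spent + r.estimatedCopyMillis > pauseBudgetMillis) {
                break; // simplification: stop at the first region that does not fit
            }
            chosen.add(r);
            spent += r.estimatedCopyMillis;
        }
        return chosen;
    }
}
```

Regions left out of one mixed collection remain candidates for the next, which is how the old generation is cleaned incrementally rather than in a single pause.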

The stop-the-world steps involved include:

  • Remark (STW): Finalizes the marking set and resolves objects that changed during concurrent marking
  • Evacuation (STW): Copies live data to new regions and reclaims source regions

GC Semantics Across Collectors

The distinction between Minor GC and Full GC varies by collector:

  • Serial / Parallel: Young-space exhaustion triggers Minor GC; old-space exhaustion triggers Full GC.
  • CMS / G1: Young-space exhaustion triggers Minor GC. Old-space exhaustion behaves differently:
    • G1: If reclamation keeps pace with allocation, old regions are reclaimed through concurrent marking and mixed collections. If allocation outpaces reclamation, G1 falls back to a Full GC with a lengthy pause (single-threaded through JDK 9, parallel since JDK 10).
    • CMS: Similar concurrent-vs-fallback behavior applies.

Remembered Sets and Cross-Region References

Scanning the entire old generation for roots during every young collection would be prohibitively expensive. G1 accelerates root scanning with a remembered-set abstraction.

The heap is covered by a card table (typically 512-byte granularity). When an old-generation object is updated to reference a young-generation object, a post-write barrier logs the dirty card into a buffer. Concurrent refinement threads asynchronously update per-region remembered sets. During young collection, G1 inspects only the dirty cards and the remembered sets rather than the full old generation.
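
A toy model makes the card arithmetic concrete. This is a minimal sketch, assuming 512-byte cards and a flat address space in which every address at or above a hypothetical `youngStart` is "young"; real G1 filters on cross-region references and buffers dirty cards for the refinement threads rather than setting bits directly.

```java
import java.util.BitSet;

class CardTableSketch {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes per card

    private final BitSet dirty;
    private final long youngStart; // hypothetical boundary between old and young addresses

    CardTableSketch(long heapBytes, long youngStart) {
        this.dirty = new BitSet((int) (heapBytes >>> CARD_SHIFT));
        this.youngStart = youngStart;
    }

    /** Maps a heap address to the index of the card that covers it. */
    static int cardIndex(long address) {
        return (int) (address >>> CARD_SHIFT);
    }

    /** Post-write barrier: runs after the field at fieldAddress was set to targetAddress. */
    void postWriteBarrier(long fieldAddress, long targetAddress) {
        boolean fieldInOld = fieldAddress < youngStart;
        boolean targetInYoung = targetAddress >= youngStart;
        if (fieldInOld && targetInYoung) {
            dirty.set(cardIndex(fieldAddress)); // only old-to-young stores are recorded
        }
    }

    boolean isDirty(long address) {
        return dirty.get(cardIndex(address));
    }

    int dirtyCardCount() {
        return dirty.cardinality();
    }
}
```

A young collection in this model would scan only the cards whose bit is set, which is the saving the remembered set provides.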

The Remark Phase and Snapshot-At-The-Beginning

During concurrent marking, application threads keep modifying references, so an object that was reachable when marking began can become hidden from the marker, for example when its only unscanned reference is overwritten. To prevent the collector from dropping live data, G1 uses a Snapshot-At-The-Beginning (SATB) mechanism.

When a reference is overwritten, a pre-write barrier captures the original reference into a SATB mark queue so that the referenced object is treated as gray. During the remark pause, G1 drains these queues and scans the affected objects, ensuring no live object is mistakenly reclaimed. The cost is some floating garbage: objects that die during marking survive until the next cycle.
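
The following toy model shows why logging the overwritten reference is enough. It is a sketch under strong assumptions: each object has a single reference field, marking state is a plain set, and the names `SatbSketch`, `Node`, `write`, and `remark` are hypothetical rather than anything from HotSpot.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class SatbSketch {
    static final class Node {
        final String name;
        Node ref; // one reference field keeps the example small

        Node(String name) {
            this.name = name;
        }
    }

    private final Set<Node> marked = new HashSet<>();
    private final Deque<Node> satbQueue = new ArrayDeque<>();
    private boolean markingActive;

    void startMarking() {
        markingActive = true;
    }

    /** Marks one object without following its field, imitating a marker that was interrupted. */
    void markOnly(Node n) {
        marked.add(n);
    }

    /** Mutator store with a pre-write barrier: the value about to be lost is logged first. */
    void write(Node holder, Node newValue) {
        if (markingActive && holder.ref != null) {
            satbQueue.add(holder.ref);
        }
        holder.ref = newValue;
    }

    /** Remark: drain the queue and mark everything reachable from the logged references. */
    void remark() {
        while (!satbQueue.isEmpty()) {
            Node n = satbQueue.poll();
            while (n != null && marked.add(n)) {
                n = n.ref;
            }
        }
        markingActive = false;
    }

    boolean isMarked(Node n) {
        return marked.contains(n);
    }
}
```

Consider a chain a → b → c where the marker has already visited a and b but has not yet followed b's field. If the application then stores c into a and clears b's field, c is reachable only through an object the marker will not revisit. Because the barrier logged c when b's field was overwritten, `remark()` still marks it. Object b stays marked even though it is now unreachable, which is the floating garbage SATB accepts.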

G1 Optimization Features

String Deduplication G1 can deduplicate the identical backing arrays of String objects (char[] on JDK 8, byte[] since JDK 9). Unlike String.intern(), which canonicalizes String instances, this optimization lets distinct String instances share one underlying array. It consumes additional CPU during young collections but reduces heap footprint.

-XX:+UseStringDeduplication
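
The difference from String.intern() is visible at the language level. The short sketch below (class and method names are illustrative) shows that interning changes which instance you hold, whereas deduplication leaves instance identity alone and is therefore not observable from Java code.

```java
class InternVersusDedup {
    /** Identity comparison, spelled out to make the intent explicit. */
    static boolean sameInstance(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = new String("order-42");
        String b = new String("order-42");

        System.out.println(sameInstance(a, b));                   // false: two distinct String objects
        System.out.println(a.equals(b));                          // true: same characters
        System.out.println(sameInstance(a.intern(), b.intern())); // true: one canonical instance

        // With -XX:+UseStringDeduplication, a and b would remain distinct instances;
        // only their internal backing arrays may end up shared by the collector.
    }
}
```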

Concurrent Class Unloading After a complete concurrent mark, classes whose loaders are unreachable are unloaded automatically. This is enabled by default:

-XX:+ClassUnloadingWithConcurrentMark

Humongous Object Handling Objects that occupy at least half a region are allocated as humongous objects in dedicated contiguous regions and are not copied during evacuation. G1 tracks incoming references from old regions; when no old-region references exist, a humongous object can be reclaimed during young collection.
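
The size rule is easy to check by hand. The helper below is a hypothetical sketch, assuming the threshold is inclusive (half a region or more) and that a humongous object occupies a whole number of contiguous regions; it ignores object headers and alignment.

```java
class HumongousCheck {
    /** True when an object of this size would be treated as humongous for the given region size. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    /** Number of whole regions the object would occupy (rounded up). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long mib = 1L << 20;
        long region = 4 * mib; // example value for -XX:G1HeapRegionSize
        System.out.println(isHumongous(3 * mib, region));
        System.out.println(regionsNeeded(9 * mib, region));
    }
}
```

If many allocations land just over the threshold, raising -XX:G1HeapRegionSize moves them back into ordinary allocation, at the cost of coarser regions.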

Initiating Heap Occupancy Tuning Concurrent marking must finish before the heap is fully consumed, otherwise G1 falls back to Full GC. JDK 8 requires a static threshold. JDK 9+ introduces adaptive IHOP that samples allocation rates and maintains a safety buffer.

GC Tuning Fundamentals

Prerequisites Familiarize yourself with relevant VM flags and diagnostic utilities such as jconsole, jmap, and unified GC logging. Tuning is workload-dependent; no universal recipe exists.
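
Alongside the external tools, the same counters are available in-process through the standard java.lang.management API. The sketch below (the `GcStats` class is illustrative) prints each collector's name, collection count, and accumulated time, which is a quick way to confirm which collector is active and how busy it has been.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    /** Sum of collection counts across all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = bean.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sum of approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = bean.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(bean.getName()
                    + ": count=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```

Sampling these totals before and after a load test gives a rough baseline to compare against once flags are changed.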

Defining Objectives

  • Low-latency workloads: prefer CMS, G1, or ZGC
  • Batch / scientific workloads: prefer ParallelGC

Eliminate Non-GC Bottlenecks First

  • Reduce result-set sizes from databases and services
  • Slim object graphs: avoid boxed primitives when scalar types suffice, and fetch only required fields
  • Eliminate memory leaks (unbounded caches, static collections). Replace them with bounded or soft-reference caches, or move the data to an external store such as Redis
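
One way to bound a cache with only the standard library is a LinkedHashMap in access order. This is a minimal sketch, not a production cache: it uses a fixed entry limit with least-recently-used eviction instead of soft references, it is not thread-safe, and the class name is illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

Because the map can never hold more than `maxEntries` entries, the cache's contribution to the old generation has a fixed ceiling, unlike a static HashMap that grows with traffic.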

Young Generation Sizing Object allocation in Eden is cheap thanks to TLABs. Because most objects die young, Minor GC is far cheaper than Full GC.

A larger young generation reduces promotion but starves the old generation, which can trigger Full GC. Oracle recommends keeping young generation between 25% and 50% of the total heap. Size Eden to comfortably hold concurrent request-response working sets.

Survivor Space Configuration Survivor regions must be large enough to retain all concurrently live objects plus objects expected to promote. Set a practical tenure threshold so long-lived objects promote quickly rather than oscillating in survivor spaces.

-XX:MaxTenuringThreshold=<n>
-XX:+PrintTenuringDistribution   # JDK 8 style; the unified-logging equivalent on JDK 9+ is -Xlog:gc+age=trace
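
The effect of the threshold can be explored with a deliberately simple model. The sketch assumes an object is promoted at the first young collection where its age has already reached the threshold, so it must stay reachable for more than `tenuringThreshold` collections; it ignores the JVM's dynamic lowering of the threshold when survivor space fills. All names and the sample lifetimes are hypothetical.

```java
class TenuringSketch {
    /**
     * Counts how many objects end up promoted.
     * collectionsSurvived[i] is the number of young collections object i stays reachable for.
     */
    static int promotedCount(int[] collectionsSurvived, int tenuringThreshold) {
        int promoted = 0;
        for (int survived : collectionsSurvived) {
            // Simplified rule: still alive after `tenuringThreshold` survivor copies means promotion.
            if (survived > tenuringThreshold) {
                promoted++;
            }
        }
        return promoted;
    }

    public static void main(String[] args) {
        int[] lifetimes = {1, 2, 3, 8, 20}; // made-up sample, in units of young collections
        System.out.println("threshold 2:  " + promotedCount(lifetimes, 2));
        System.out.println("threshold 15: " + promotedCount(lifetimes, 15));
    }
}
```

A low threshold promotes medium-lived objects that would soon have died, while a high one keeps them in survivor space at the cost of repeated copying; the tenuring distribution log shows where real objects fall.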

Old Generation and CMS Considerations

  • A larger old generation benefits CMS by leaving headroom for floating garbage
  • If logs show no Full GC, the old generation is unlikely to be the bottleneck; optimize young generation first
  • When Full GC does occur, note the old-generation occupancy at that point and increase old-generation capacity by 25–33%

-XX:CMSInitiatingOccupancyFraction=<percent>

Practical Tuning Scenarios

Scenario 1: Frequent Minor and Full GC Small young generations force short-lived objects to promote prematurely. These objects clutter the old generation and trigger Full GC.

Remediation: Increase young generation capacity and raise the tenure threshold to keep transient data in the young generation longer.

Scenario 2: Prolonged Pauses During Peak Traffic (CMS) High allocation rates during traffic spikes lengthen the remark phase: the young generation is full of newly allocated objects, and the collector must scan both generations.

Remediation: Enable pre-remark young collection to reduce the scan set:

-XX:+CMSScavengeBeforeRemark

Also ensure sufficient old-gen headroom:

-XX:CMSInitiatingOccupancyFraction=75

Scenario 3: Full GC with Sufficient Old Space (CMS on JDK 7) JDK 7 and earlier keep class metadata in the permanent generation, a separately sized area managed by the collector. When PermGen fills, the JVM triggers Full GC even if the old generation has ample free space.

Remediation: Increase permanent generation bounds:

-XX:PermSize=<size>
-XX:MaxPermSize=<size>

(Note: JDK 8 replaced PermGen with metaspace, decoupling class metadata from heap collections.)
