Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

JVM Garbage Collection Mechanisms and Algorithms

Tech Jun 23 3

In the Java Virtual Machine, most object instances reside in the heap memory. Before performing garbage collection, the JVM must distinguish between live objects and dead objects. Only objects marked as dead can have their memory space reclaimed during garbage collection. This identification process is known as the garbage marking phase.

An object is considered dead when it's no longer referenced by any live objects. The JVM typically uses two primary methods to determine object liveness: reference counting and reachability analysis.

The reference counting algorithm is straightforward. Each object maintains an integer counter that tracks how many references point to it. When a new reference to an object is created, its counter increments by 1. When a reference is removed, the counter decrements by 1. When the counter reaches zero, the object is considered unreachable and can be reclaimed.

Advantages:

  • Simple implementation
  • Easy to identify garbage objects
  • High efficiency with no collection delay

Disadvantages:

  • Requires additional storage space for counters
  • Overhead from increment/decrement operations
  • Cannot handle circular references, which is a critical flaw

Java does not use reference counting due to its inability to handle circular references. The following code demonstrates this limitation:


/**
 * Demonstrating that Java doesn't use reference counting
 */
public class ReferenceCountingDemo {
    private byte[] memoryBlock = new byte[5 * 1024 * 1024]; // 5MB

    ReferenceCountingDemo otherRef = null;

    public static void main(String[] args) {
        ReferenceCountingDemo obj1 = new ReferenceCountingDemo();
        ReferenceCountingDemo obj2 = new ReferenceCountingDemo();

        obj1.otherRef = obj2;
        obj2.otherRef = obj1;

        obj1 = null;
        obj2 = null;
        
        // Force garbage collection
        System.gc();

        try {
            Thread.sleep(1000000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

Despite the circular references between obj1 and obj2, the JVM successfully reclaims these objects when System.gc() is called, proving that Java doesn't use reference counting.

Also known as root search or tracing garbage collection, the reachability analysis algorithm effectively solves the circular reference problem that plagues reference counting. This algorithm is used in Java and other modern languages.

The algorithm starts from a set of root objects (GC Roots) and traverses the object graph to determine which objects are reachable. Objects without any reference chain from GC Roots are considered unreachable and can be reclaimed.

GC Roots in Java

GC Roots include:

  1. Objects referenced in the virtual machine stack (method parameters, local variables)
  2. Objects referenced in the native method stack
  3. Static properties in the method area
  4. Constants in the method area (e.g., string pool references)
  5. Objects held by synchronization locks
  6. JVM internal references (Class objects, exceptions, system class loader)
  7. JMX beans, JVMTI callbacks, local code cache
  8. Temporary objects based on specific garbage collectors and memory regions

Important Note: Reachability analysis must be performed on a consistent snapshot of the memory. This is one reason why garbage collection often requires "Stop The World" pauses, even in collectors designed for minimal pauses like CMS.

Java provides a finalization mechanism that allows developers to define custom cleanup logic before an object is destroyed. The garbage collector invokes the finalize() method of an object before reclaiming it.

Key Points about finalize():

  • Should never be explicitly called by developers
  • May cause object resurrection
  • Execution timing is not guaranteed
  • Poorly implemented finalize() methods can significant impact GC performance

Object Lifecycle States:

  • Reachable: The object can be accessed from GC Roots
  • Revivable: The object has no references but might resurrect in finalize()
  • Unreachable: The object's finalize() has been called without resurrection

The following code demonstrates object resurrection through finalize():


/**
 * Demonstrating object resurrection through finalization
 */
public class ResurrectionDemo {
    public static ResurrectionDemo instance; // Class variable, a GC Root

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("Finalize method called");
        instance = this; // Resurrect the object
    }

    public static void main(String[] args) {
        instance = new ResurrectionDemo();
        instance = null;
        System.gc();
        
        try {
            Thread.sleep(2000); // Wait for Finalizer thread
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        
        if (instance == null) {
            System.out.println("Object was reclaimed");
        } else {
            System.out.println("Object was resurrected");
        }
    }
}

Mark-Sweep Algorithm

The mark-sweep algorithm involves two phases:

  1. Mark: Traverse from GC Roots, marking all reachable objects
  2. Sweep: Iterate through memory, reclaiming unmarked objects

Advantages:

  • Simple implementation
  • No memory overhead beyond the marking process

Disadvantages:

  • Can be inefficient
  • Requires application pause
  • Creates memory fragments

Copying Algorithm

The copying algorithm divides memory into two spaces. It uses one space at a time and copies live objects to the other space during collection, then swaps the spaces.

Advantages:

  • Simple and efficient
  • Compacts memory, eliminating fragmentation

Disadvantages:

  • Requires double memory space
  • Copying overhead when many objects survive

This algorithm is particularly effective for the young generation, where most objects are short-lived.

Mark-Compact Algorithm

The mark-compact algorithm combines mark-sweep with memory compaction:

  1. Mark: Same as mark-sweep
  2. Compact: Move all live objects to one end of memory
  3. Sweep: Clear the remaining space

Advantages:

  • Eliminates memory fragmentation
  • Enables pointer allocation (bump-the-pointer)
  • No memory halving like copying algorithm

Disadvantages:

  • Less efficient than copying
  • Object relocation requires updating references
  • Requires application pause

Generational collection leverages the observation that objects have varying lifespans. The heap is divided into:

  • Young Generation:
    • Smaller area with short-lived objects
    • High frequency of collection
    • Uses copying algorithm for efficiency
  • Old Generation:
    • Larger area with long-lived objects
    • Infrequent collection
    • Uses mark-sweep or mark-compact

This approach optimizes collection by applying the most appropriate algorithm to each generation.

Incremental Collection

To minimize application pauses, incremental collection alternates between garbage collection and application threads. The collector processes small portions of memory at a time, reducing pause duration but potentially increasing overall collection time due to context switching.

Partitioned Collection (G1 Garbage Collector)

The G1 collector divides the heap into multiple independant regions. Instead of collecting the entire heap at once, G1 targets specific regions based on collection goals, allowing for more predictable pause times. This approach provides better control over latency compared to traditional collectors.

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.