JVM Architecture and Garbage Collection Types
What is JVM?
JVM stands for Java Virtual Machine.
JVM Components
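Note: In HotSpot before Java 8, the runtime constant pool lived in the permanent generation, which was part of the heap. Java 8 removed the permanent generation in favor of metaspace, which is allocated in native memory. The runtime constant pool remains logically part of the method area, while the string pool has lived in the heap since Java 7.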
Program Counter (PC Register)
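Like a CPU's program counter, it holds the address of the JVM instruction that the thread is currently executing. Each thread has its own PC register, and its value is undefined while the thread runs a native method.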
Method Area
Stores static variables, constants, class information, and the runtime constant pool.
Native Method Stack
Supports the execution of native methods, which are declared with the native keyword in Java and are typically implemented in C or C++ libraries.
For example, in the Thread class, yielding and starting a thread both rely on native methods, and those calls run on the native method stack:
public static native void yield();
private native void start0();
Virtual Machine Stack
Mainly stores method information, such as local variable tables, operand stacks, dynamic linking, and method return addresses.
Each method's execution from call to completion corresponds to a stack frame being pushed onto and popped from the virtual machine stack.
In an active thread, only the top stack frame is valid, known as the current stack frame, and its associated method is the current method.
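When a thread ends, its stack memory is released, so the stack needs no garbage collection. It can still overflow, however: if the call depth exceeds the stack size (for example through unbounded recursion), the JVM throws a StackOverflowError.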
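Because every call pushes one frame, the stack depth is bounded by the thread's stack size. The following is a minimal sketch, not a benchmark: the class and method names are made up for illustration, and the depth it reports depends on the JVM, the frame size, and the stack size (which can be changed with -Xss).

```java
// Hypothetical demo class: counts how many frames fit on the current thread's stack.
class StackDepthDemo {
    private static int depth = 0;

    // Each call pushes one more stack frame; there is deliberately no base case.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the call depth reached when the JVM reported StackOverflowError.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames are popped while the error propagates, so the thread can continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before overflow: " + measureDepth());
    }
}
```

Running it twice with different -Xss values should show the reported depth changing with the stack size.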
Heap
The heap is the largest memory area and primarily stores object instances. Not every object has to be allocated there, however: the JIT compiler's escape analysis examines how far a reference to each object can travel and may avoid the heap allocation altogether.
An object that is created and used only inside a single method, without being returned or stored where other code can reach it, is non-escaping.
Non-escaping objects can be kept out of the heap. HotSpot does this through scalar replacement: the object is broken up into its fields, which are held as primitive values in the method's local variables or registers.
Escape analysis also detects objects that are used by a single thread only, so locks on them can be removed (lock elision), reducing synchronization overhead.
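The difference between escaping and non-escaping objects can be sketched as follows. The class and method names are hypothetical, and whether the JIT actually applies scalar replacement or lock elision depends on the JVM, its flags, and how hot the code is. The code only shows the shapes that make these optimizations possible.

```java
// Hypothetical classes for illustration only.
class EscapeDemo {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // The Point never leaves this method, so it does not escape.
    // A JIT compiler may replace it with two int locals (scalar replacement).
    static int sumNonEscaping(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    // The Point is returned to the caller, so it escapes this method
    // and generally needs a real heap allocation.
    static Point createEscaping(int x, int y) {
        return new Point(x, y);
    }

    // The StringBuffer is reachable from one thread only, so the JIT
    // may drop the locking inside its synchronized methods.
    static String joinLocal(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a).append(b);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sumNonEscaping(3, 4));
        System.out.println(createEscaping(3, 4).x);
        System.out.println(joinLocal("a", "b"));
    }
}
```

On HotSpot, escape analysis is enabled by default. Running a hot loop over sumNonEscaping with -XX:-DoEscapeAnalysis and comparing allocation rates is one way to observe its effect.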
If the heap cannot satisfy an allocation even after garbage collection, a java.lang.OutOfMemoryError (OOM) is thrown.
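How close a program is to that limit can be checked from inside the JVM with the standard Runtime API. This is a small sketch; the class name and the helper methods are made up for illustration.

```java
// Hypothetical helper that reports the current heap figures.
class HeapInfo {
    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    // Heap currently occupied: reserved heap minus its unused part.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max   = " + toMb(rt.maxMemory()) + " MB");   // limit the heap may grow to (see -Xmx)
        System.out.println("total = " + toMb(rt.totalMemory()) + " MB"); // heap currently reserved
        System.out.println("used  = " + toMb(usedBytes()) + " MB");
    }
}
```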
JVM heap memory is divided into two parts: the young generation and the old generation.
In fact, there used to be a third area, the permanent generation, which held class metadata rather than ordinary objects. It existed only in JDK 1.7 and earlier versions; JDK 8 replaced it with metaspace, which is not detailed here.
Young Generation
The young generation is where all new objects are created.
When the young generation memory space is full, garbage collection is triggered, known as Minor GC.
The young generation can be further divided into three parts:
- Eden space
- Survivor space (From space)
- Survivor space (To space)
Note: The two survivor spaces swap roles after each Minor GC. At any time one holds the surviving objects and the other is empty, and the empty one is always the To space.
This corresponds to the copying algorithm used in garbage collection.
Key points about the young generation space:
- Most newly created objects are located in the Eden space.
- When the Eden space is filled with objects, Minor GC is executed, and all surviving objects are moved to one of the survivor spaces, leaving the Eden space empty after GC.
- Minor GC also checks surviving objects and moves them to the other survivor space. Thus, over time, one survivor space is always empty.
- After multiple GC cycles, objects that still survive are moved to the old generation memory space. This is governed by an age threshold: an object that has survived that many Minor GCs is promoted to the old generation. The threshold is set with -XX:MaxTenuringThreshold and defaults to 15 in most HotSpot collectors.
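The copying and aging steps above can be sketched with a toy model. This is only an illustration under simplifying assumptions: real collectors copy raw memory rather than Java lists, the class and field names are invented, and the exact point at which an object is promoted differs between collectors.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the young generation; not how a real collector is implemented.
class YoungGenModel {
    static final class Obj {
        int age = 0;
        boolean reachable = true;
    }

    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();      // always empty between collections
    final List<Obj> old = new ArrayList<>();

    YoungGenModel(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    // New objects start in Eden with age 0.
    Obj allocate() {
        Obj o = new Obj();
        eden.add(o);
        return o;
    }

    void minorGc() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();
        from.clear();
        // Swap roles: the space just filled becomes From, the emptied one becomes To.
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj o : space) {
            if (!o.reachable) continue;           // garbage is simply not copied
            if (o.age >= tenuringThreshold) {
                old.add(o);                       // old enough: promote
            } else {
                o.age++;
                to.add(o);                        // still young: copy to the To space
            }
        }
    }

    public static void main(String[] args) {
        YoungGenModel heap = new YoungGenModel(2);
        heap.allocate();                          // stays reachable
        heap.allocate().reachable = false;        // becomes garbage
        for (int i = 1; i <= 3; i++) {
            heap.minorGc();
            System.out.println("GC " + i + ": survivor=" + heap.from.size() + ", old=" + heap.old.size());
        }
    }
}
```

With a threshold of 2, the reachable object sits in a survivor space after the first two collections and is promoted by the third, while the unreachable one disappears at the first collection.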
Old Generation
The old generation memory contains long-lived objects and those that have survived multiple Minor GCs.
Garbage collection in the old generation usually occurs when its memory is full.
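The generations described above are visible at runtime as memory pools through the standard java.lang.management API. The sketch below lists them. The class name is made up, and the pool names it prints depend on the collector in use (for example, names containing "Eden Space" or "Old Gen"), so they should not be relied on in code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the JVM's memory pools.
class MemoryPoolLister {
    // Names of the pools that belong to the heap (young and old generation areas).
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();  // may be null if the pool is no longer valid
            long usedKb = usage == null ? -1 : usage.getUsed() / 1024;
            System.out.println(pool.getType() + "  " + pool.getName() + "  used=" + usedKb + " KB");
        }
    }
}
```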
Garbage Collection Types
Minor GC
Garbage collection in the young generation is called Minor GC. Since most Java objects have short lifespans, Minor GC is very frequent and generally fast.
Major GC
Garbage collection in the old generation is called Major GC. The term is used loosely: in practice it often refers to a Full GC, which collects the entire GC heap.
Full GC
Full GC is clearly defined as a global GC covering the entire young generation, old generation, and metaspace (which replaced perm gen in Java 8 and later).
CMS and G1 are specific garbage collectors; details can be found in related resources.
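Which collectors a running JVM uses, and how often they have run, can be read through the standard GarbageCollectorMXBean interface. This is a minimal sketch with an invented class name. The collector names in the output depend on the JVM and the selected collector, and they usually come as one bean for young collections and one for old or full collections.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that reports per-collector statistics.
class GcStats {
    // Sum of the collection counts of all collectors; undefined counts (-1) are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```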
Using Tools to Diagnose OOM Exceptions
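Enter jvisualvm on the command line to start VisualVM, the analysis tool bundled with JDK 6 through 8. Since JDK 9 it is no longer bundled with the JDK and has to be downloaded separately as VisualVM.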
In this software, go to the top-left toolbar -> Tools -> Plugins to download the Visual GC plugin.
After downloading, ensure it is activated; it may activate automatically.
Once installed, it provides a graphical view of the heap regions and GC activity.
Simulating a heap overflow program:
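import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class HeapOverflowSimulator {
    public static void main(String[] args) throws InterruptedException {
        // Every chunk stays referenced from this list, so the GC can never reclaim it.
        // Running with a small heap (for example -Xmx256m) reaches the error sooner.
        List<byte[]> dataList = new ArrayList<>();
        final int BYTES_PER_KB = 1024;
        final int CHUNK_MB = 8;
        int chunkSize = BYTES_PER_KB * BYTES_PER_KB * CHUNK_MB; // 8 MB per iteration
        for (int iteration = 0; iteration < BYTES_PER_KB; iteration++) {
            System.out.println("Writing " + ((iteration + 1) * CHUNK_MB) + " MB");
            TimeUnit.MILLISECONDS.sleep(200); // slow down so the growth is visible in the monitor
            dataList.add(new byte[chunkSize]);
        }
    }
}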
After running, a new Java process appears in the list on the left.
The console displays memory write information.
Similarly, heap usage can be seen increasing.
Eventually heap usage peaks, an OutOfMemoryError is thrown, the program terminates, and its memory is released, which shows up as a sharp drop.
During program execution, heap dumps can be taken.
Specific class information can be viewed in the dumps.
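A heap dump can also be triggered from code instead of from the graphical tool. The sketch below assumes a HotSpot-based JVM, since HotSpotDiagnosticMXBean lives in the com.sun.management package and is not part of the portable API. The class name and the output file name are made up. The target file must not already exist, and newer JDKs require the name to end in .hprof.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper; works only on JVMs that provide HotSpotDiagnosticMXBean.
class HeapDumper {
    // Writes a heap dump to the given path. With liveOnly = true only reachable objects are dumped.
    static void dump(String path, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        String path = "heap-" + System.currentTimeMillis() + ".hprof";
        dump(path, true);
        System.out.println("Heap dump written to " + path);
    }
}
```

The resulting .hprof file can be opened in VisualVM. For the overflow scenario above, starting the JVM with -XX:+HeapDumpOnOutOfMemoryError makes it write a dump automatically when the error occurs.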