Object Instantiation, Memory Layout, and Reference Access in the JVM
Object Instantiation in the JVM
Object Creation Methods
Interview Questions from Major Tech Companies
Tencent:
- How does an object get stored in the JVM?
- What information resides in the object header?
ByteDance:
- What components make up the object header in Java?
Six Ways to Create Objects
- new operator: The most common approach. Static factory methods, such as getInstance() in a singleton or a static factory class, also use new internally.
- Class.newInstance(): Deprecated since Java 9. It can only invoke the no-arg constructor, which must be accessible to the caller (typically public).
- Constructor.newInstance(): The preferred reflection-based approach. Works with both no-arg and parameterized constructors.
- clone(): Creates an object without invoking any constructor. The class must implement the Cloneable interface and override the clone() method.
- Deserialization: Reconstructs an object from binary data read from a file or network stream. Commonly used for socket communication and persistent storage.
- Objenesis: A third-party library that bypasses constructor invocation entirely, useful for creating objects in tests.
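To make the difference between these paths concrete, here is a minimal sketch that creates the same object through new, Constructor.newInstance(), clone(), and a serialization round trip. The CreationDemo and Point classes and the constructorCalls counter are hypothetical names introduced for illustration only; Objenesis is left out because it is a third-party dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CreationDemo {
    // Hypothetical sample class; the counter records how often a constructor runs.
    static class Point implements Cloneable, Serializable {
        private static final long serialVersionUID = 1L;
        static int constructorCalls = 0;
        int x;

        public Point(int x) {
            this.x = x;
            constructorCalls++;
        }

        @Override
        public Point clone() {
            try {
                return (Point) super.clone(); // field-by-field copy, no constructor
            } catch (CloneNotSupportedException e) {
                throw new IllegalStateException(e); // unreachable: Point is Cloneable
            }
        }
    }

    // Reflection: look up the (int) constructor and invoke it.
    static Point viaReflection(int x) {
        try {
            return Point.class.getDeclaredConstructor(int.class).newInstance(x);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialization round trip through an in-memory byte array.
    static Point viaSerialization(Point original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point a = new Point(1);          // new operator
        Point b = viaReflection(2);      // Constructor.newInstance()
        Point c = a.clone();             // clone()
        Point d = viaSerialization(b);   // deserialization
        System.out.println("constructor calls: " + Point.constructorCalls);
        System.out.println("c.x=" + c.x + ", d.x=" + d.x);
    }
}
```

Four objects exist at the end, but the counter stops at 2: only new and reflection run the Point constructor, while clone() and deserialization produce their instances without it.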
The Object Creation Process
Examining Object Creation Through Bytecode
public class ObjectCreationDemo {
    public static void main(String[] args) {
        Object instance = new Object();
    }
}
The corresponding bytecode:
public static void main(java.lang.String[]);
  descriptor: ([Ljava/lang/String;)V
  flags: ACC_PUBLIC, ACC_STATIC
  Code:
    stack=2, locals=2, args_size=1
       0: new           #2    // class java/lang/Object
       3: dup
       4: invokespecial #1    // Method java/lang/Object."<init>":()V
       7: astore_1
       8: return
Step 1: Verify Class Loading
When the JVM encounters a new instruction, it performs the following checks:
- Checks whether the instruction's operand resolves to a symbolic reference to a class in the constant pool, which lives in the method area (Metaspace).
- Verifies whether the class behind that reference has already been loaded, resolved, and initialized.
- If the class hasn't been loaded, the classloader follows the parent-delegation model (parent-first loading) to locate the .class file, using the classloader, package name, and class name as the lookup key.
- If the file isn't found, a ClassNotFoundException is thrown. Otherwise, the class is loaded and a Class object is created.
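The lookup described above can be observed from Java code. The following is a small sketch, not the JVM's internal procedure: it triggers loading explicitly with Class.forName, which is the path on which ClassNotFoundException is reported, and walks the parent chain of the current classloader. ClassLoadingCheck and com.example.DoesNotExist are hypothetical names.

```java
class ClassLoadingCheck {
    // Returns true if the named class can be located and loaded.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // no loader in the delegation chain found the class
        }
    }

    // Counts the loaders in the delegation chain, starting at this class's loader.
    // The bootstrap loader is represented as null and is not counted.
    static int delegationDepth() {
        int depth = 0;
        for (ClassLoader cl = ClassLoadingCheck.class.getClassLoader();
             cl != null;
             cl = cl.getParent()) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.lang.Object"));
        System.out.println(isLoadable("com.example.DoesNotExist")); // hypothetical name
        System.out.println(String.class.getClassLoader()); // null: bootstrap loader
        System.out.println(delegationDepth());
    }
}
```

Core classes such as String report a null loader because they come from the bootstrap loader at the top of the hierarchy.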
Step 2: Allocate Memory
After loading the class, the JVM knows the object's memory footprint, because the size is fully determined by the class. For example, an instance variable that is a reference only requires pointer space: 4 bytes with compressed references, 8 bytes without. How that memory is carved out of the heap depends on whether the free space is contiguous:
- Contiguous Memory - Bump the Pointer: When the heap is compacted, used memory occupies one region and free memory occupies another, with a boundary pointer between them. Allocating memory simply moves this pointer by the object's size. This approach is available with collectors that compact or copy, such as Serial and ParNew.
- Fragmented Memory - Free List: When used and free blocks are interleaved, the JVM maintains a list tracking the available blocks. During allocation, it searches for a block large enough and updates the list. This method is required with non-compacting, mark-sweep-based collectors such as CMS.
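The two allocation strategies can be contrasted with a toy model. This is only a sketch under simplifying assumptions: addresses are plain int offsets, the free list uses a first-fit search, and the class names (AllocatorSketch, BumpAllocator, FreeListAllocator) are hypothetical. HotSpot's real allocators are considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

class AllocatorSketch {
    // Bump the pointer: a single cursor separates used space from free space.
    static class BumpAllocator {
        private final int capacity;
        private int top = 0;

        BumpAllocator(int capacity) {
            this.capacity = capacity;
        }

        // Returns the start offset of the new object, or -1 if the space is exhausted.
        int allocate(int size) {
            if (top + size > capacity) {
                return -1;
            }
            int address = top;
            top += size; // allocation is just a pointer move
            return address;
        }
    }

    // Free list: first-fit search over {start, length} blocks.
    static class FreeListAllocator {
        private final List<int[]> freeBlocks = new ArrayList<>();

        void addFreeBlock(int start, int length) {
            freeBlocks.add(new int[] {start, length});
        }

        // Returns the start offset of the chosen block, or -1 if no block is large enough.
        int allocate(int size) {
            for (int i = 0; i < freeBlocks.size(); i++) {
                int[] block = freeBlocks.get(i);
                if (block[1] >= size) {
                    int address = block[0];
                    block[0] += size; // shrink the block from the front
                    block[1] -= size;
                    if (block[1] == 0) {
                        freeBlocks.remove(i);
                    }
                    return address;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(48);
        System.out.println(bump.allocate(16)); // 0
        System.out.println(bump.allocate(24)); // 16

        FreeListAllocator list = new FreeListAllocator();
        list.addFreeBlock(0, 8);
        list.addFreeBlock(32, 24);
        System.out.println(list.allocate(16)); // 32: the first block is too small
    }
}
```

The bump allocator does constant work per allocation, while the free list has to search and update its bookkeeping, which is why contiguous free space is preferred when the collector can provide it.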
Step 3: Handle Concurrency
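Object allocation is extremely frequent, and several threads may try to move the same allocation pointer at once, so it requires synchronization. Two techniques are used:
- CAS with Retry: The pointer is updated with a compare-and-swap operation that is retried on collision, which makes the update atomic.
- TLAB (Thread-Local Allocation Buffer): Each thread receives a small pre-allocated chunk of Eden and allocates from it without contention. Synchronization is only needed when the buffer is exhausted and a new one is requested. Controlled by -XX:+UseTLAB, which is enabled by default in HotSpot.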
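The CAS-with-retry idea can be sketched with an AtomicLong standing in for the shared allocation pointer. This is an illustration of the technique, not HotSpot's code; CasBumpAllocator and allocateConcurrently are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasBumpAllocator {
    private final AtomicLong top = new AtomicLong(0); // shared allocation pointer
    private final long capacity;

    CasBumpAllocator(long capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the allocated block, or -1 if the space is full.
    long allocate(long size) {
        while (true) {
            long current = top.get();
            long next = current + size;
            if (next > capacity) {
                return -1;
            }
            if (top.compareAndSet(current, next)) {
                return current; // this thread won the race
            }
            // Another thread moved the pointer first: retry with the fresh value.
        }
    }

    long used() {
        return top.get();
    }

    // Runs `threads` threads that each allocate `perThread` blocks of `size` bytes
    // and returns the total number of bytes handed out.
    static long allocateConcurrently(int threads, int perThread, long size) {
        CasBumpAllocator allocator = new CasBumpAllocator(Long.MAX_VALUE);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    allocator.allocate(size);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return allocator.used();
    }

    public static void main(String[] args) {
        System.out.println(allocateConcurrently(4, 1000, 8)); // 32000
    }
}
```

No allocation is lost or handed out twice even though nothing is locked. A TLAB goes one step further by giving each thread its own private pointer, so the CAS is only needed when a thread asks for a new buffer.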
Step 4: Initialize Allocated Memory
All fields receive default (zero) values, which is why instance fields can be read without explicit initialization:
- Object references: null
- Numeric types: 0 or 0.0
- boolean: false
- char: \u0000
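A short check of these defaults, using a hypothetical Sample class that declares fields without initializers:

```java
class DefaultValues {
    // Hypothetical class with no initializers: every field keeps its zero value.
    static class Sample {
        Object ref;
        int i;
        long l;
        double d;
        boolean flag;
        char c;
    }

    public static void main(String[] args) {
        Sample s = new Sample();
        System.out.println(s.ref);                       // null
        System.out.println(s.i + " " + s.l + " " + s.d); // 0 0 0.0
        System.out.println(s.flag);                      // false
        System.out.println((int) s.c);                   // 0, the NUL character
    }
}
```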
Field Assignment Sequence:
- Default value initialization
- Explicit initialization or instance initializer block (execution order follows source code)
- Constructor initialization
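The sequence above can be verified by logging each step. In this sketch the InitOrder and Widget classes and the record helper are hypothetical; the point is that field initializers and initializer blocks run in source order, and the constructor body runs last.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrder {
    static final List<String> LOG = new ArrayList<>();

    // Appends a step to the log; the return value lets it be used as a field initializer.
    static int record(String step) {
        LOG.add(step);
        return 0;
    }

    // Hypothetical class: two initializers and a block interleaved in source order.
    static class Widget {
        int first = record("field initializer: first");

        {
            record("initializer block");
        }

        int second = record("field initializer: second");

        Widget() {
            record("constructor body");
        }
    }

    public static void main(String[] args) {
        new Widget();
        LOG.forEach(System.out::println);
    }
}
```

The log shows first, the block, second, and then the constructor body, matching the textual order in the class.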
Step 5: Configure Object Header
The object header stores:
- Metadata about the object's class
- Hash code (lazily computed)
- Garbage collection information
- Synchronization state (biased lock, lightweight lock, or heavyweight lock)
The header format differs between 32-bit and 64-bit JVMs, and also depends on whether compressed class pointers are enabled.
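As an illustration of how several of these items share one word, the sketch below decodes a 64-bit mark word. It assumes the classic 64-bit HotSpot layout described in the JDK 8 sources (2 lock bits, 1 biased-lock bit, 4 age bits, 1 unused bit, 31 hash bits); newer JDK versions have removed biased locking and may lay the word out differently, so treat the bit positions as an assumption. MarkWordSketch is a hypothetical helper and does not read real object headers.

```java
class MarkWordSketch {
    // Assumed layout (classic 64-bit HotSpot, unlocked object):
    // | unused:25 | hash:31 | unused:1 | age:4 | biased_lock:1 | lock:2 |

    static int lockBits(long mark) {
        return (int) (mark & 0b11);
    }

    static boolean biased(long mark) {
        return ((mark >>> 2) & 1) == 1;
    }

    // GC age: number of young collections the object has survived (0..15).
    static int age(long mark) {
        return (int) ((mark >>> 3) & 0xF);
    }

    // Identity hash code; only meaningful while the object is unlocked.
    static int hash(long mark) {
        return (int) ((mark >>> 8) & 0x7FFFFFFFL);
    }

    static String lockState(long mark) {
        switch (lockBits(mark)) {
            case 0b00: return "lightweight-locked";
            case 0b10: return "heavyweight-locked";
            case 0b11: return "marked-for-gc";
            default:   return biased(mark) ? "biased" : "unlocked";
        }
    }

    public static void main(String[] args) {
        // Hand-built example value: hash 0x1234567, age 5, unlocked.
        long mark = (0x1234567L << 8) | (5L << 3) | 0b001L;
        System.out.println(Integer.toHexString(hash(mark)));
        System.out.println(age(mark));
        System.out.println(lockState(mark));
    }
}
```

To inspect real headers, a tool such as OpenJDK's JOL is the usual choice; the sketch only shows why a single word can carry hash, age, and lock state at once.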
Step 6: Execute Initialization
From the Java perspective, this is when the object truly comes into existence. The JVM:
- Executes instance initializer blocks
- Initializes fields with explicit values
- Invokes the constructor
- Assigns the object's heap address to the reference variable
The <init>() method executes right after new, following the bytecode sequence emitted by the compiler (new, dup, invokespecial).
Examining <init>() Through Bytecode
public class Person {
    int age = 25;
    String occupation;
    Address homeAddress;

    {
        occupation = "Software Engineer";
    }

    public Person() {
        homeAddress = new Address();
    }
}

class Address {
}
Bytecode of the Person constructor (the <init> method in Person.class):
0 aload_0
1 invokespecial #1 <java/lang/Object.<init>>
4 aload_0
5 bipush 25
7 putfield #2 <com/example/Person.age>
10 aload_0
11 ldc #3 <Software Engineer>
13 putfield #4 <com/example/Person.occupation>
16 aload_0
17 new #5 <com/example/Address>
20 dup
21 invokespecial #6 <com/example/Address.<init>>
24 putfield #7 <com/example/Person.homeAddress>
27 return
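The <init>() method performs, in order:
- Superclass constructor call: java/lang/Object.<init>
- Explicit assignment: age = 25
- Instance initializer block: occupation = "Software Engineer"
- Constructor body: homeAddress = new Address()
Default (zero) initialization of the fields is not part of <init>(); the JVM has already done it in Step 4.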
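One observable consequence of this ordering is that a superclass constructor runs before the subclass's field initializers, so it can still see the zeroed default. The sketch below uses hypothetical Base and Derived classes to show this; calling an overridable method from a constructor is done here only to expose the ordering and is not a recommended practice.

```java
class InitPhases {
    static class Base {
        final int seenDuringSuperConstructor;

        Base() {
            // Virtual call: dispatches to Derived.value() before Derived's
            // field initializers have run.
            seenDuringSuperConstructor = value();
        }

        int value() {
            return -1;
        }
    }

    static class Derived extends Base {
        int age = 25; // assigned in Derived.<init>, after Base.<init> returns

        @Override
        int value() {
            return age;
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringSuperConstructor); // 0
        System.out.println(d.age);                        // 25
    }
}
```

The value 0 comes from the zeroing done at allocation time, and 25 only appears once the subclass part of <init> has executed.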
Object Memory Layout
The HotSpot JVM organizes object memory as follows:
+------------------+---------------------------------+
| Object Header    | Instance Data Area              |
+------------------+---------------------------------+
| Mark Word (8B)   |                                 |
| Class Pointer    | int age = 25;                   |
| (compressed)     | String occupation;              |
| Array Length     | Address homeAddress;            |
| (if applicable)  |                                 |
+------------------+---------------------------------+
                   | Alignment Padding               |
                   +---------------------------------+
Complete Example:
public class Product {
    int productId = 100;
    String productName;
    double price;

    {
        productName = "Default Product";
    }

    public Product() {
        price = 0.0;
    }

    public static void main(String[] args) {
        Product item = new Product();
    }
}

class Address {
}
The resulting memory layout contains the object header, instance fields, and potential alignment padding to ensure 8-byte boundaries.
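The size arithmetic for this example can be worked through in a few lines. The sketch assumes a 64-bit HotSpot JVM with compressed class pointers and compressed references, which gives a 12-byte header (8-byte mark word plus 4-byte class pointer) and 4-byte references. ObjectSizeSketch is a hypothetical helper, and the estimate ignores any gaps the JVM may insert between fields.

```java
class ObjectSizeSketch {
    // Rounds size up to the next multiple of alignment.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header plus fields, padded to an 8-byte boundary.
    static long estimate(long headerBytes, long... fieldBytes) {
        long size = headerBytes;
        for (long field : fieldBytes) {
            size += field;
        }
        return alignUp(size, 8);
    }

    public static void main(String[] args) {
        // Product: int productId (4), String productName reference (4), double price (8).
        System.out.println(estimate(12, 4, 4, 8)); // 28 rounded up to 32
        // A plain java.lang.Object: header only.
        System.out.println(estimate(12));          // 12 rounded up to 16
    }
}
```

Under these assumptions a Product instance occupies 32 bytes, 4 of which are alignment padding.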
Object Access
How JVM Accesses Objects Through References
Local variables in a stack frame hold references that must resolve to objects on the heap and to their class metadata in Metaspace. Two access strategies exist.
Strategy 1: Handle-Based Access
Stack Frame         Heap                      Metaspace
+--------------+    +--------------------+    +---------------+
| Reference    |    | Handle             |    | Class         |
| (pointer)    |--->| +--------------+   |    | Metadata      |
+--------------+    | | Object Data  |---+--->|               |
                    | +--------------+   |    +---------------+
                    | | Class Ref    |   |
                    | +--------------+   |
                    +--------------------+
Advantages:
- Reference remains stable when objects relocate during garbage collection.
- Only the handle's internal pointer updates; no changes to stack references.
Disadvantages:
- Extra heap allocation for handle storage.
- Two memory indirection steps to reach object data.
Strategy 2: Direct Pointer Access (HotSpot Implementation)
Stack Frame         Heap                      Metaspace
+--------------+    +--------------------+    +---------------+
| Reference    |    | Object             |    | Class         |
| (direct)     |--->| +--------------+   |    | Metadata      |
+--------------+    | | Object Header|---+--->|               |
                    | +--------------+   |    +---------------+
                    | | Instance Data|   |
                    | +--------------+   |
                    +--------------------+
Advantages:
- Single pointer dereference for direct access.
- Superior performance due to reduced indirection.
Disadvantages:
- Requires updating every reference to an object when it moves during garbage collection.
- The collector performs these updates as part of relocation, so the cost is paid during collection rather than on every access.
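The trade-off between the two strategies can be modelled with a toy heap. In this sketch the heap is an int array with one slot per "object", references are array indices, and ReferenceAccessSketch with its methods is a hypothetical model rather than how any JVM stores objects.

```java
class ReferenceAccessSketch {
    final int[] heap = new int[16];       // toy heap: one slot per "object"
    final int[] handleTable = new int[4]; // handle id -> heap address

    // Handle-based access: two lookups (handle table, then heap).
    int readViaHandle(int handle) {
        return heap[handleTable[handle]];
    }

    // Direct access: one lookup.
    int readDirect(int address) {
        return heap[address];
    }

    // Moves an object; only the handle entry changes, references stay valid.
    void relocateWithHandle(int handle, int newAddress) {
        int oldAddress = handleTable[handle];
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = 0;
        handleTable[handle] = newAddress;
    }

    // Moves an object; every direct reference to it must be patched.
    // Returns the number of references that were updated.
    int relocateDirect(int[] references, int oldAddress, int newAddress) {
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = 0;
        int patched = 0;
        for (int i = 0; i < references.length; i++) {
            if (references[i] == oldAddress) {
                references[i] = newAddress;
                patched++;
            }
        }
        return patched;
    }

    public static void main(String[] args) {
        ReferenceAccessSketch vm = new ReferenceAccessSketch();

        vm.heap[3] = 42;
        vm.handleTable[0] = 3;
        vm.relocateWithHandle(0, 7);
        System.out.println(vm.readViaHandle(0)); // 42, handle 0 is still valid

        vm.heap[5] = 99;
        int[] stackReferences = {5, 5, 2};
        System.out.println(vm.relocateDirect(stackReferences, 5, 9)); // 2 references patched
        System.out.println(vm.readDirect(stackReferences[0]));        // 99
    }
}
```

Handles make relocation cheap and reads slower; direct pointers make reads cheap and shift the work to the collector, which is the trade-off HotSpot accepts.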