Fading Coder

Profiling Production Java Applications with Java Flight Recorder

Java Flight Recorder (JFR) is a low-overhead, production-ready profiling utility integrated directly into the OpenJDK and Oracle JDK distributions. It relies on instrumentation points built into the JVM and the core libraries, which record runtime telemetry as discrete, timestamped events. These events cover garbage collection cycles, thread scheduling, class loading, socket I/O, monitor contention, and heap allocations. The collected data is buffered in memory and periodically flushed to disk, which keeps the impact on the target application small.

Event Collection Architecture

The recorder is built into the HotSpot VM. When an event exceeds its configured threshold or a periodic timer fires, the VM serializes the relevant state into a compact binary format. Events are first written to thread-local buffers, and a separate recorder thread handles flushing them to disk, so the application threads experience little added latency. Event types can be toggled dynamically, and custom application-specific events can be defined and emitted through the jdk.jfr API with only a few lines added around the code being measured.
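
As a rough illustration of the custom event support mentioned above, the sketch below defines and commits an application event with the jdk.jfr API. The event name com.example.BatchProcessed, its fields, and the processBatch helper are hypothetical and chosen only for this example.

```java
import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

class BatchEventDemo {

    // Hypothetical event type: the name, category, and fields are illustrative only.
    @Name("com.example.BatchProcessed")
    @Label("Batch Processed")
    @Category("Application")
    static class BatchProcessedEvent extends Event {
        @Label("Batch Id")
        int batchId;

        @Label("Item Count")
        int itemCount;
    }

    // Wraps a unit of work in an event and returns the number of items handled.
    static int processBatch(int batchId, int itemCount) {
        BatchProcessedEvent event = new BatchProcessedEvent();
        event.begin(); // marks the start time of the event

        int processed = 0;
        for (int i = 0; i < itemCount; i++) {
            processed++; // stand-in for real work
        }

        event.end(); // marks the end time of the event
        // shouldCommit() is false when the event is disabled or below its threshold,
        // which lets the code skip populating fields that would never be written.
        if (event.shouldCommit()) {
            event.batchId = batchId;
            event.itemCount = processed;
            event.commit();
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("Processed items: " + processBatch(1, 1000));
    }
}
```

When no recording is active, the event is effectively a no-op, so code like this can usually stay in place in production builds.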

Enabling and Configuring Recordings

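JFR has been available without commercial license restrictions since JDK 11, and it was later backported to OpenJDK 8 (update 8u262). Recordings can be initiated at launch or started on a running process using jcmd. The following command starts a continuous recording with the profile settings, keeps at most 500 MB of recording data on disk, and dumps the file automatically on application shutdown. To limit retained data by time instead, for example to the last 30 minutes, the maxage=30m option can be added: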

java \
  -XX:StartFlightRecording=filename=runtime_trace.jfr,maxsize=500M,dumponexit=true,settings=profile \
  com.example.WorkloadExecutor
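
The same kind of recording can also be started from inside the application. The following is a minimal sketch using the jdk.jfr.Recording API, assuming JDK 11 or later; the ProgrammaticRecording class and its record helper are hypothetical names, and the size and age limits are illustrative values rather than recommendations.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;
import java.time.Duration;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

class ProgrammaticRecording {

    // Records while the task runs, writes the data to target, and returns the file size in bytes.
    static long record(Path target, Runnable task) throws IOException, ParseException {
        // "profile" is one of the predefined configurations shipped with the JDK.
        Configuration profile = Configuration.getConfiguration("profile");
        try (Recording recording = new Recording(profile)) {
            recording.setToDisk(true);
            recording.setMaxAge(Duration.ofMinutes(30)); // keep roughly the last 30 minutes
            recording.setMaxSize(500L * 1024 * 1024);    // and at most about 500 MB on disk
            recording.start();
            task.run();
            recording.stop();
            recording.dump(target);                      // copy the recorded data to the target file
        }
        return Files.size(target);
    }

    public static void main(String[] args) throws Exception {
        Path target = Path.of("runtime_trace.jfr");
        long bytes = record(target, () -> {
            double sum = 0.0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += Math.sqrt(i);
            }
            System.out.println("Work result: " + sum);
        });
        System.out.println("Wrote " + bytes + " bytes to " + target);
    }
}
```

Starting the recording in code is mainly useful when only a specific phase of the application should be captured; for whole-process profiling the command-line flag is usually simpler.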

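To generate meaningful telemetry for analysis, the target application should perform representative operations. The sample below allocates large arrays, sleeps while holding a lock, and runs a CPU-bound loop, so the recording should contain allocation, thread sleep, and execution sample events. Because the sample runs on a single thread, the lock is never contended; contention events would appear only if additional threads competed for it.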

package com.example;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class WorkloadExecutor {

    private static final ReentrantLock processingLock = new ReentrantLock();

    public static void main(String[] args) throws InterruptedException {
        WorkloadExecutor engine = new WorkloadExecutor();
        engine.executeBatch();
    }

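    public void executeBatch() throws InterruptedException {
        // Keeps the allocated arrays reachable so they put pressure on the heap.
        List<byte[]> allocationBuffer = new ArrayList<>();
        double checksum = 0.0;

        for (int cycle = 0; cycle < 50; cycle++) {
            processingLock.lock();
            try {
                checksum += simulateHeavyComputation(allocationBuffer, cycle);
                // Sleeping while holding the lock produces thread sleep events.
                Thread.sleep(50);
            } finally {
                processingLock.unlock();
            }
        }
        // Printing the checksum keeps the JIT from discarding the computation loop.
        System.out.println("Execution complete (checksum " + checksum + "). Check generated JFR file.");
    }

    private double simulateHeavyComputation(List<byte[]> buffer, int iteration) {
        // Every fifth iteration allocates a 2 MiB array.
        if (iteration % 5 == 0) {
            buffer.add(new byte[1024 * 1024 * 2]);
        }
        // CPU-bound loop that shows up in execution samples.
        double result = 0.0;
        for (long i = 0; i < 5_000_000L; i++) {
            result += Math.sqrt(i);
        }
        return result;
    }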
}

Analyzing Telemetry Data

Once the JVM terminates or the recording is stopped manually via jcmd <pid> JFR.stop, the resulting .jfr file can be loaded into JDK Mission Control (JMC). The JMC interface visualizes event streams, correlates GC pauses with thread activity and stack traces, and highlights lock contention hotspots. For automated pipelines, the jfr command-line tool (for example, jfr summary or jfr print) or the jdk.jfr.consumer API can parse recordings and extract metrics to feed dashboards.
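
As a sketch of the programmatic route, the class below reads a recording with the jdk.jfr.consumer API and counts events per type. The EventHistogram class name is hypothetical, and the default file name simply reuses runtime_trace.jfr from the launch command.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

class EventHistogram {

    // Returns the number of events per event type name, sorted by name.
    static Map<String, Long> countByType(Path file) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        // Reading event by event avoids loading the whole recording into memory.
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args.length > 0 ? args[0] : "runtime_trace.jfr");
        countByType(file).forEach((type, count) -> System.out.println(count + "\t" + type));
    }
}
```

The resulting map could be exported to whatever metrics system a pipeline already uses; that integration is outside the scope of this sketch.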

Key Operational Benefits

  • Negligible Overhead: With the default settings, JFR typically consumes less than 1% CPU, making continuous production profiling feasible; the more detailed profile settings cost somewhat more.
  • Comprehensive Visibility: Captures cross-cutting concerns like safepoint delays, class unloading, and loaded native libraries alongside standard application metrics.
  • Zero-Dependency Integration: Bundled with the JVM, requiring no external agents or third-party bytecode instrumentation.

Practical Deployment Scenarios

  • Bottleneck Identification: Pinpointing slow database queries, excessive allocation patterns, or thread pool exhaustion.
  • Latency Investigation: Correlating response time spikes with garbage collection pauses or OS-level scheduling delays.
  • Capacity Planning: Measuring resource consumption trends over extended periods to inform infrastructure scaling decisions.
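
For the latency investigation scenario, one possible starting point is to scan a recording for garbage collections whose longest pause exceeds a threshold. The sketch below assumes the jdk.GarbageCollection event and its name and longestPause fields as emitted by current HotSpot builds; the LongGcPauseFinder class and the 100 ms threshold are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

class LongGcPauseFinder {

    // Returns one line per GC event whose longest pause is at least the threshold.
    static List<String> findLongPauses(Path file, Duration threshold) throws IOException {
        List<String> hits = new ArrayList<>();
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!"jdk.GarbageCollection".equals(event.getEventType().getName())) {
                    continue;
                }
                // Assumed field names; skip events that do not carry them.
                if (!event.hasField("longestPause") || !event.hasField("name")) {
                    continue;
                }
                Duration longest = event.getDuration("longestPause");
                if (longest.compareTo(threshold) >= 0) {
                    Object collector = event.getValue("name");
                    hits.add(event.getStartTime() + " " + collector + " " + longest.toMillis() + " ms");
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args.length > 0 ? args[0] : "runtime_trace.jfr");
        // 100 ms is an arbitrary example threshold.
        findLongPauses(file, Duration.ofMillis(100)).forEach(System.out::println);
    }
}
```

The timestamps in the output can then be compared against response time spikes from application logs or request traces.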
