Diagnosing High CPU Usage in Java Applications with Alibaba's Arthas and Useful Scripts
When production Java applications exhibit elevated CPU consumption, identifying the root cause demands a systematic approach. This guide demonstrates resolution strategies using both traditional JDK utilities and modern diagnostic tools.
Converting Thread IDs to Hexadecimal Format
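jstack output identifies each native thread by its nid field, written in hexadecimal. The operating system reports thread IDs in decimal (for example, in the output of top -H -p [process_pid]), so convert those decimal thread IDs using printf: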
printf '%x\n' 404 405 375 376
Output:
194
195
177
178
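As a sketch of how this conversion fits into a workflow, the helper below ranks threads by CPU share and prints each one in the nid=0x194 style that jstack uses. The function name busiest_nids is hypothetical, and the input is assumed to be lines of "decimal thread ID, CPU percentage"; the ps invocation in the comment is one way to produce that on Linux with procps, not the only one.

```shell
# Read "TID %CPU" lines on stdin and print the N busiest threads (default 3),
# with each thread ID converted to the hexadecimal nid form found in jstack.
# busiest_nids is a hypothetical helper name, not part of any JDK tool.
busiest_nids() {
  LC_ALL=C sort -k2,2nr | head -n "${1:-3}" |
    while read -r tid cpu; do
      printf 'nid=0x%x %s\n' "$tid" "$cpu"
    done
}

# Assumed usage on Linux (procps ps) for process 373:
#   ps -L -o tid= -o pcpu= -p 373 | busiest_nids 3
```

Each -o column is passed separately because a trailing = (which suppresses the header) consumes the rest of its argument.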
Analyzing Thread Stacks with jstack
The jstack utility captures thread dump information for a given process.
Command Structure:
jstack [process_pid] | grep [hex_thread_id] -A10
Practical Application:
jstack 373 | grep '0x194' -A10
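This prints the entry for the thread whose nid is 0x194, followed by the next ten lines of the dump. When CPU usage is high, the hot threads are often garbage collection workers, which appear under names such as "GC task thread#0 (ParallelGC)". These are JVM-internal threads that perform collection work, so finding them at the top points toward memory pressure rather than a specific application method.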
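Because grep -A10 always prints a fixed ten lines, it can cut off a deep stack or spill into the next thread's entry. A minimal alternative, assuming the default jstack layout in which thread entries are separated by blank lines, is to let awk treat each entry as one record. The name thread_stack is hypothetical.

```shell
# Print the complete jstack entry for one native thread, read from stdin.
# $1: hexadecimal thread ID as jstack prints it, e.g. 0x194
# Assumes blank-line-separated entries (plain jstack, without -l).
thread_stack() {
  # The trailing space in the key stops 0x194 from also matching 0x1945.
  awk -v key="nid=$1 " 'BEGIN { RS = "" } index($0, key) { print }'
}

# Assumed usage:
#   jstack 373 | thread_stack 0x194
```

Setting RS to the empty string switches awk to paragraph mode, so the whole stack is printed regardless of its depth.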
Persisting Stack Dumps for Bulk Analysis
For comprehensive analysis across multiple threads, persist the stack dump to file:
jstack 373 > demo.dump
Search through the saved dump using grep:
cat -n demo.dump | grep -A10 '0x194'
This approach isolates specific thread activity and identifies method call chains consuming CPU cycles.
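A saved dump also allows a quick overview before drilling into single threads. The sketch below counts threads per state, relying on the "java.lang.Thread.State:" line that jstack prints for Java threads; the function name thread_states is hypothetical, and JVM-internal threads without such a line are simply not counted.

```shell
# Count threads per Java thread state in a saved jstack dump.
# $1: path to the dump file, e.g. demo.dump
# Prints "COUNT STATE" lines, most frequent state first.
thread_states() {
  awk '/java.lang.Thread.State:/ { n[$2]++ }
       END { for (s in n) print n[s], s }' "$1" | sort -rn
}

# Assumed usage:
#   thread_states demo.dump
```

A large number of RUNNABLE threads sharing the same top frame is a typical hint of a busy loop, whereas mostly WAITING or BLOCKED threads suggest the CPU load originates elsewhere, for example in GC threads.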
Examining Garbage Collection Behavior
Frequent minor or major garbage collection events often correlate with elevated CPU consumption. The jstat utility provides JVM garbage collection statistics.
Command Syntax:
jstat -gcutil [pid] [interval_ms] [iteration_count]
Example:
jstat -gcutil 373 2000 5
Output fields indicate:
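- S0: Survivor space 0 utilization (percentage of capacity)
- S1: Survivor space 1 utilization (percentage of capacity)
- E: Eden space utilization (percentage of capacity)
- O: Old generation utilization (percentage of capacity)
- M: Metaspace utilization (percentage of capacity)
- CCS: Compressed class space utilization (percentage of capacity)
- YGC: Number of young generation collections
- YGCT: Young generation collection time (seconds)
- FGC: Number of full GC events
- FGCT: Full GC time (seconds)
- GCT: Total garbage collection time (seconds)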
Excessive GC frequency or extended collection times warrant deeper investigation into memory allocation patterns.
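Since YGC and FGC are cumulative counters, what matters is how fast they grow between samples. The sketch below reads jstat -gcutil output and reports the increase between the first and last sample, plus the latest old generation utilization. The name gc_delta is hypothetical; columns are looked up by header name because the set of columns differs between JDK versions.

```shell
# Summarize `jstat -gcutil` output read from stdin: growth of the YGC and
# FGC counters between the first and last sample, and the last O value.
gc_delta() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # header row
    NF == 0 { next }                                         # skip blanks
    !seen   { y0 = $(col["YGC"]); f0 = $(col["FGC"]); seen = 1 }
            { y1 = $(col["YGC"]); f1 = $(col["FGC"]); old = $(col["O"]) }
    END     { if (seen) printf "YGC+%d FGC+%d O=%s%%\n", y1 - y0, f1 - f0, old }
  '
}

# Assumed usage (5 samples, 2 seconds apart):
#   jstat -gcutil 373 2000 5 | gc_delta
```

Several full collections within a ten-second window while O stays near 100% would suggest the heap cannot be reclaimed, which keeps the GC threads busy.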
Streamlined Diagnosis with Arthas
Arthas (Alibaba's open-source Java diagnostic framework) provides an integrated solution supporting JDK 6+ across Linux, macOS, and Windows. Its interactive command-line interface offers tab completion and comprehensive troubleshooting capabilities.
Installing Arthas
curl -O https://alibaba.github.io/arthas/arthas-boot.jar
java -jar arthas-boot.jar
Select the target Java process from the presented list. The interface displays as:
[arthas@373]$
Dashboard Overview
Execute the dashboard command to observe real-time thread activity and CPU utilization:
dashboard
The dashboard highlights threads consuming significant CPU resources alongside GC metrics.
Thread-Specific Analysis
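Inspect a single thread by passing the ID shown in the dashboard's ID column. This is the JVM thread ID, not the hexadecimal nid used with jstack: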
thread [thread_id]
thread 32
This reveals method calls within the thread's execution path, directly identifying problematic code sections.
Code Decompilation
Arthas enables runtime code inspection without restarting the application:
jad com.rainbow.demo.service.TestWhile
This decompiles the bytecode of the class as currently loaded in the JVM and displays the resulting source, revealing implementation details of suspected methods.
Terminating the Session
quit
Automated Troubleshooting with show-busy-java-threads
The show-busy-java-threads script from the useful-scripts collection automates CPU bottleneck identification. It leverages jstack internally while presenting results in an accessible format.
Acquisition
wget https://github.com/oldratlee/useful-scripts/releases/download/v2.7/show-busy-java-threads.sh
Execution
chmod +x show-busy-java-threads.sh
./show-busy-java-threads.sh
The script enumerates the top CPU-consuming threads, displaying method call chains responsible for CPU pressure. This approach significantly accelerates diagnostic workflows compared to manual jstack analysis.
Resolution Pathway
- Identify threads exhibiting high CPU consumption
- Translate thread IDs to hexadecimal format
- Extract stack traces to locate problematic methods
- Verify whether GC activity contributes to elevated utilization
- Examine application code for infinite loops, excessive allocations, or inefficient algorithms
- Implement corrective measures and validate resolution
Effective CPU troubleshooting combines several tools: jstack for the initial capture, jstat for GC analysis, and Arthas or show-busy-java-threads for faster runtime inspection. Used together, they shorten the path from symptom to root cause.