CPU Usage Calculation: ps vs top
Understanding CPU Usage Computation
CPU usage essentially measures the share of time a processor spends executing a process. On Linux, this execution time is tracked in units called jiffies (clock ticks), with HZ ticks per second. Dividing the tick count by HZ gives the CPU time consumed by a process in seconds, and dividing that value by the elapsed wall-clock time yields the CPU usage: (jiffies / HZ) / total_time. Multiply by 100 to express it as a percentage.
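As a small worked example of this arithmetic, the Go sketch below converts a tick count into seconds and then into a percentage of elapsed time. The function names cpuSeconds and cpuPercent are hypothetical, and the tick rate of 100 per second is an assumption (a common value, but it should be checked on the target system).

```go
package main

import "fmt"

// cpuSeconds converts a tick count into seconds of CPU time.
// hz is the number of ticks per second.
func cpuSeconds(ticks, hz uint64) float64 {
	return float64(ticks) / float64(hz)
}

// cpuPercent expresses the CPU time as a percentage of the elapsed
// wall-clock time, given in seconds.
func cpuPercent(ticks, hz uint64, elapsedSeconds float64) float64 {
	if elapsedSeconds <= 0 {
		return 0
	}
	return cpuSeconds(ticks, hz) / elapsedSeconds * 100
}

func main() {
	const hz = 100 // assumption: 100 ticks per second

	// 250 ticks at 100 ticks per second is 2.5 s of CPU time.
	fmt.Printf("%.2f s\n", cpuSeconds(250, hz))

	// 2.5 s of CPU time over 5 s of wall-clock time is 50%.
	fmt.Printf("%.1f%%\n", cpuPercent(250, hz, 5))
}
```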
Contrasting ps and top
Both ps and top are widely used for monitoring CPU usage, yet they employ fundamentally different methodologies.
We can demonstrate this difference using a simple Go program:
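package main

import (
	"bytes"
	"fmt"
	"strconv"
	"sync"
	"time"
)

var testData = []byte(`testdata`)

// wg lets main wait until every worker goroutine has finished.
var wg sync.WaitGroup

// testBuffer burns CPU by filling 100 buffers with repeated writes.
func testBuffer(idx int) {
	defer wg.Done()

	m := make(map[string]*bytes.Buffer)
	for i := 0; i < 100; i++ {
		key := strconv.Itoa(i)
		buf, ok := m[key]
		if !ok {
			buf = new(bytes.Buffer)
		}
		for j := 0; j < 1024; j++ {
			buf.Write(testData)
		}
		m[key] = buf
	}
	fmt.Println("done, ", idx)
}

func main() {
	// Start 10 workers so the process can use several cores at once.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go testBuffer(i)
	}
	wg.Wait()

	// Stay alive but idle so the tools can be compared afterwards.
	fmt.Println("sleeping")
	time.Sleep(time.Hour)
}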
Observing the Discrepancy
Running this program and checking its CPU usage with top and ps reveals notable differences:
Using top -n 1:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
39753 infini    20   0 14.663g 0.014t   1200 S 611.1 22.2   0:23.53 test-cpu
Using ps aux:
USER       PID %CPU %MEM      VSZ      RSS TTY      STAT START   TIME COMMAND
infini   39881  767 39.1 26505284 25791892 pts/16   Sl+  07:04   0:38 ./test-cpu
After all testBuffer goroutines finish and the program goes to sleep, top reports near-zero CPU usage while ps still shows a high value:
USER       PID %CPU %MEM      VSZ      RSS TTY      STAT START   TIME COMMAND
infini   39881 82.3 42.4 28638148 27953532 pts/16   Sl+  07:04   0:40 ./test-cpu
Explaining the Difference
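The discrepancy comes from how each tool runs. top stays running and samples the process repeatedly, so it can measure usage over an interval. ps runs once, takes a single snapshot, and exits, so it has no earlier sample to compare against. This determines how each one computes %CPU: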
ps calculates average CPU usage over the process's lifetime:
(total_cpu_time / total_process_uptime)
This means that even after the process goes idle, ps keeps reporting a nonzero value that decays only gradually as the process's uptime grows.
top, however, computes usage over a measured interval:
(current_cpu_time - last_cpu_time) / iteration_duration
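The two formulas can be put side by side in a short Go sketch. The function names lifetimeAverage and intervalUsage are hypothetical, and the input numbers are illustrative rather than measurements from the run above: a process that has accumulated 40 s of CPU time over 50 s of uptime and has since gone idle.

```go
package main

import "fmt"

// lifetimeAverage mirrors the ps-style calculation: total CPU time
// divided by how long the process has existed, as a percentage.
func lifetimeAverage(cpuSeconds, uptimeSeconds float64) float64 {
	if uptimeSeconds <= 0 {
		return 0
	}
	return cpuSeconds * 100 / uptimeSeconds
}

// intervalUsage mirrors the top-style calculation: CPU time consumed
// between two samples divided by the time between them, as a percentage.
func intervalUsage(prevCPU, currCPU, intervalSeconds float64) float64 {
	if intervalSeconds <= 0 {
		return 0
	}
	return (currCPU - prevCPU) * 100 / intervalSeconds
}

func main() {
	// Illustrative values: 40 s of CPU time after 50 s of uptime,
	// with no additional CPU time used during the last 3 s.
	fmt.Printf("ps-style:  %.1f%%\n", lifetimeAverage(40, 50))
	fmt.Printf("top-style: %.1f%%\n", intervalUsage(40, 40, 3))

	// Ten idle minutes later the lifetime average has only decayed.
	fmt.Printf("ps-style later: %.1f%%\n", lifetimeAverage(40, 650))
}
```

With these inputs the lifetime average is still 80% while the interval-based value is 0%, which is the same pattern as the ps and top readings taken after the workers finished.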
Continuous CPU Usage Monitoring
Monitoring systems typically consist of data collection and visualization components. ps is convenient for one-off collection, but its %CPU value is an average over the process's lifetime rather than a real-time metric.
For accurate real-time monitoring, we need to collect raw CPU time data and calculate usage based on sampling intervals:
delta(cpu_time) / delta(timestamp)
On Linux, both tools read the following fields from /proc/[PID]/stat:
utime: CPU time this process has spent in user code (jiffies)
stime: CPU time this process has spent in kernel code (jiffies)
cutime: CPU time its waited-for children have spent in user code (jiffies)
cstime: CPU time its waited-for children have spent in kernel code (jiffies)
Because these counters only ever increase, sampling them periodically and taking their derivative with respect to the sample timestamps yields accurate, near-real-time CPU usage.