1️⃣ One‑Command “Master View” (Highly Recommended)
🔍 What it shows (and why it matters)
| Field | Why it’s important |
|---|---|
| user | Who owns the process |
| pid | Process ID |
| ppid | Parent process (helps detect orphans) |
| stat | Process state (R/S/D/Z/T) |
| %cpu | CPU consumption |
| %mem | Memory usage |
| etime | How long the process has been running |
| lstart | Exact start time |
| wchan | Kernel wait channel (I/O diagnosis) |
| comm | Executable name |
| --sort=-%cpu | Top CPU consumers first |
✅ This is your best single snapshot for general troubleshooting
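A command consistent with the fields in the table above might look like the sketch below. It assumes the procps-ng `ps` found on most Linux distributions (`--sort` and the `wchan:20` width suffix are not available in BSD or BusyBox `ps`), and `master_view` is only an illustrative wrapper name.

```shell
# Assumption: procps-ng ps on Linux. master_view is a hypothetical helper name.
master_view() {
    # lstart prints a multi-word timestamp; wchan:20 widens the wait-channel column.
    ps -eo user,pid,ppid,stat,%cpu,%mem,etime,lstart,wchan:20,comm --sort=-%cpu
}

# Show the header plus the 15 busiest processes.
master_view | head -n 16
```

Note that `%cpu` in `ps` is CPU time divided by the process lifetime, so a long-running process that only recently became busy can rank lower than expected.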
2️⃣ CPU Troubleshooting (High CPU / Run Queues)
Key columns
| Column | Meaning |
|---|---|
| psr | Which CPU core it’s running on |
| pri | Kernel priority |
| ni | Nice value |
| time | Total CPU time consumed |
✅ Use when:
- Load average is high
- CPU is saturated
- Performance complaints
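One way to put the columns above into a single view is sketched here, again assuming procps-ng `ps`. The helper names `cpu_view` and `count_runnable` are illustrative; the second one counts processes in the R state, which is a rough proxy for run-queue pressure.

```shell
# Assumption: procps-ng ps on Linux. Helper names are illustrative.
cpu_view() {
    ps -eo pid,user,psr,pri,ni,stat,%cpu,time,comm --sort=-%cpu
}

# Read STAT values on stdin (one per line) and count those starting with R.
count_runnable() {
    awk '$1 ~ /^R/ {n++} END {print n + 0}'
}

# Header plus the 10 heaviest CPU consumers.
cpu_view | head -n 11

# Number of runnable processes right now (includes ps itself).
ps -eo stat= | count_runnable
```

If the runnable count stays well above the number of cores across several samples, the CPU is likely saturated rather than momentarily busy.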
3️⃣ I/O Troubleshooting (MOST CRITICAL)
🔥 Identify blocked processes (D state)
Why this is powerful
| Field | Purpose |
|---|---|
| D | Uninterruptible sleep (I/O wait) |
| wchan | What kernel function it’s stuck on |
| etime | How long it has been blocked |
Common wchan values and meaning
| wchan | Meaning |
|---|---|
| io_schedule | Disk I/O wait |
| wait_on_page_bit | Memory/disk interaction |
| nfs_wait | NFS hang |
| blk_mq_get_tag | Storage queue congestion |
🚨 If Oracle or other database processes appear here, a storage problem is the most likely cause
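A minimal sketch for listing D-state processes with their wait channel follows. It assumes procps-ng `ps`; `only_d_state` and `blocked_procs` are hypothetical helper names, and the filter relies on STAT being the fourth column of the chosen format.

```shell
# Assumption: procps-ng ps on Linux. Helper names are illustrative.

# Keep the header row plus rows whose STAT column (4th field) starts with D.
only_d_state() {
    awk 'NR == 1 || $4 ~ /^D/'
}

blocked_procs() {
    ps -eo pid,ppid,user,stat,etime,wchan:32,comm | only_d_state
}

blocked_procs
```

A process caught in D for a single sample is often harmless. The ones worth chasing are those that remain in D across several runs a few seconds apart.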
4️⃣ Memory & Leak Detection
Key fields
| Field | Meaning |
|---|---|
| rss | Resident (physical) memory, in KiB |
| vsz | Virtual memory size, in KiB |
| %mem | Share of physical RAM used |
✅ Use when:
- System is swapping
- OOM killer events
- Slow performance despite low CPU
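The fields above can be combined into a memory-sorted view, sketched below under the assumption of procps-ng `ps`. `mem_view` and `total_rss_mib` are illustrative names; the total is only a rough figure because shared pages are counted once per process.

```shell
# Assumption: procps-ng ps on Linux. Helper names are illustrative.
mem_view() {
    ps -eo pid,user,rss,vsz,%mem,etime,comm --sort=-rss
}

# Sum the RSS column (3rd field, KiB) and print the total in whole MiB.
# Shared pages are counted once per process, so this overstates real usage.
total_rss_mib() {
    awk 'NR > 1 {sum += $3} END {printf "%d\n", sum / 1024}'
}

# Header plus the 10 largest processes by resident memory.
mem_view | head -n 11

# Approximate total resident memory across all processes, in MiB.
mem_view | total_rss_mib
```

For leak hunting, run the view periodically and compare rss for the same pid: a value that keeps growing while etime increases is the pattern to look for.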
5️⃣ Full Command, Arguments & Environment
Why this matters:
- cmd shows complete arguments
- Crucial for:
  - Java tuning
  - Oracle startup flags
  - Application misconfiguration
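A sketch for pulling the full command line and environment of one process is shown here. It assumes procps-ng `ps` and a Linux `/proc` filesystem; `full_cmd` and `proc_env` are hypothetical helper names, and the PID and variable name in the usage comments are placeholders.

```shell
# Assumptions: procps-ng ps and a Linux /proc filesystem. Helper names are illustrative.

# Full, untruncated command line for one PID (-ww disables width truncation).
full_cmd() {
    ps -ww -o pid,user,lstart,cmd -p "$1"
}

# Environment of a PID, one VAR=value per line.
# Reading another user's process normally requires root.
proc_env() {
    tr '\0' '\n' < "/proc/$1/environ"
}

# Usage (1234 is a placeholder PID):
#   full_cmd 1234
#   proc_env 1234 | grep '^JAVA_HOME='
```

The environment file reflects what the process was started with, so variables changed by the process after startup may not appear there.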
6️⃣ Zombie Process Detection
Why care?
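- Zombies indicate a bug in the parent process (it is not reaping its children)
- Can exhaust the PID space
- Clearing them requires fixing or restarting the parent; kill has no effect on the zombie itself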
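Since the parent is what needs attention, a useful sketch lists zombies together with their parent PID and counts zombies per parent. It assumes procps-ng `ps`; `zombies` and `zombies_per_parent` are illustrative names.

```shell
# Assumption: procps-ng ps on Linux. Helper names are illustrative.

# Header plus every process whose STAT column (3rd field) starts with Z.
zombies() {
    ps -eo pid,ppid,stat,etime,comm | awk 'NR == 1 || $3 ~ /^Z/'
}

# Read "PPID STAT" pairs on stdin; print "PPID COUNT" for parents that own zombies.
zombies_per_parent() {
    awk '$2 ~ /^Z/ {n[$1]++} END {for (p in n) print p, n[p]}'
}

zombies

# Which parents are leaking zombies, and how many each.
ps -eo ppid= -o stat= | zombies_per_parent
```

A parent with a steadily growing count is the process to investigate or restart.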
7️⃣ Oracle / Database‑Focused View (DBA Favorite)
✅ Detects:
- DBWR / LGWR I/O stalls
- Parallel worker hangs
- Backup‑related blockages
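A possible Oracle-focused view is sketched below. It assumes procps-ng `ps` and that Oracle background processes carry the usual `ora_<name>_<SID>` process title (for example `ora_dbw0_ORCL`, where ORCL is a made-up SID); `ora_only`, `ora_blocked`, and `oracle_view` are hypothetical helper names.

```shell
# Assumptions: procps-ng ps; Oracle background processes are titled ora_<name>_<SID>.
# Helper names are illustrative. Columns: PID STAT ELAPSED WCHAN COMMAND...

# Header plus rows whose command (5th field) starts with ora_.
ora_only() {
    awk 'NR == 1 || $5 ~ /^ora_/'
}

# Header plus Oracle rows that are also in uninterruptible sleep (D).
ora_blocked() {
    awk 'NR == 1 || ($5 ~ /^ora_/ && $2 ~ /^D/)'
}

oracle_view() {
    ps -eo pid,stat,etime,wchan:32,args | ora_only
}

oracle_view
ps -eo pid,stat,etime,wchan:32,args | ora_blocked
```

Rows for DBWR or LGWR that show up in the second listing repeatedly, with an I/O-related wchan, point toward the storage layer rather than the database itself.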
8️⃣ Thread‑Level Analysis (Advanced CPU Debugging)
Use when:
- Java or Oracle shows high CPU
- Need hot thread detection
- Correlating with perf/jstack
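A thread-level sketch using the `-L` option of procps-ng `ps` is shown below. `hot_threads` and `tid_to_hex` are illustrative names. The hex conversion is included because jstack output in many JDK versions reports native thread IDs in hex (`nid=0x...`); treat that format as an assumption to verify against your JDK.

```shell
# Assumption: procps-ng ps on Linux. Helper names are illustrative.

# Ten busiest threads of one process: PID, TID, core, %CPU, state, name.
hot_threads() {
    ps -L -p "$1" -o pid,tid,psr,pcpu,stat,comm | tail -n +2 | sort -k4,4nr | head -n 10
}

# Convert a decimal thread ID to hex for matching against thread-dump output.
tid_to_hex() {
    printf '0x%x\n' "$1"
}

# Example: threads of the current shell, then a sample conversion.
hot_threads "$$"
tid_to_hex 4660   # prints 0x1234
```

The TID of the hottest thread can then be searched for in a thread dump or passed to perf to see what that thread is executing.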
9️⃣ Parent‑Child Relationship Analysis
✅ Great for:
- Detecting fork storms
- Tracing hung parent processes
- Understanding service trees
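For the parent-child view, one option is the `--forest` output of procps-ng `ps`, sketched here together with a small counter that ranks parents by number of children (a quick way to spot a fork storm). `tree_view` and `children_per_parent` are hypothetical helper names.

```shell
# Assumption: procps-ng ps on Linux (--forest is a GNU extension). Helper names are illustrative.
tree_view() {
    ps -eo pid,ppid,user,stat,etime,args --forest
}

# Read one PPID per line on stdin; print "COUNT PPID", busiest parent first.
children_per_parent() {
    awk '{n[$1]++} END {for (p in n) print n[p], p}' | sort -rn
}

tree_view | head -n 40

# Five parents with the most children.
ps -eo ppid= | children_per_parent | head -n 5
```

On most systems PID 1 and the kernel thread parent (PID 2) top this list, so an unfamiliar PID with a large count is the one to look at.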
10️⃣ Minimal “Health Check” Command (Quick & Safe)
✅ Safe for production
✅ Quick triage
✅ Covers 80% of issues
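A minimal health check along these lines could pair a per-state process count with a short top-CPU list. This is a sketch assuming procps-ng `ps`; `state_summary` and `health_check` are illustrative names, and the choice of columns is an assumption rather than a fixed recipe.

```shell
# Assumption: procps-ng ps on Linux. Helper names are illustrative.

# Read STAT values on stdin; count processes per state letter (first character).
state_summary() {
    awk '{n[substr($1, 1, 1)]++} END {for (s in n) print s, n[s]}' | sort
}

health_check() {
    # How many processes are in each state (R, S, D, Z, T, ...).
    ps -eo stat= | state_summary
    # Header plus the five busiest processes.
    ps -eo pid,stat,%cpu,%mem,etime,comm --sort=-%cpu | head -n 6
}

health_check
```

Any nonzero D or Z count in the summary is the cue to switch to the more detailed views for I/O or zombie analysis.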
🔑 What to Focus On (Cheat Sheet)
| Symptom | Look at |
|---|---|
| High load | %cpu, R state |
| Stuck system | D state, wchan |
| Slowness | %cpu, %mem, etime |
| Hung DB | ora_* + D |
| Memory issues | rss, %mem |
| Defunct processes | Z |
✅ Final Recommendation (What to Remember)
If you remember only ONE command, make it this:
ps -eo user,pid,ppid,stat,%cpu,%mem,etime,wchan,comm --sort=-%cpu
This single command gives:
✅ CPU
✅ I/O
✅ Memory
✅ State
✅ Ownership
✅ Runtime
✅ Kernel wait reason