iostat -xm 2 5 | awk '$1 ~ /^(sd|dm)/ && $NF > 40 {printf "%-10s %s\n",$1,$NF"%"}'
iostat -xm 2 5 | awk '$1 ~ /^(sd|dm)/ && $NF > 40 {print}'
iostat -xm 2 5 | awk '/Device/ {print; next} $1 ~ /^(sd|dm)/ && $NF > 90 {print}'
📌 Header Breakdown (Deep Explanation)
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
✅ 1. Device
- Logical or physical disk name
- sdX → physical disk
- dm-X → device mapper (LVM, ASM, multipath)
👉 In your case:
dm-* = logical volumes / DB storage layers
✅ 2. rrqm/s (Read Requests Merged per second)
- Number of read requests merged by OS scheduler
Why merging matters:
- OS combines adjacent reads to reduce I/O calls
👉 Example:
- 10 small reads → merged → 1 large read
✅ Interpretation:
- High value → efficient sequential I/O
- Zero → either random I/O, or requests already issued at full size so there is nothing left to merge
✅ 3. wrqm/s (Write Requests Merged per second)
- Same as above but for writes
✅ High value:
- Good for sequential writes (e.g., redo logs, batch loads)
✅ 4. r/s (Reads per second)
- Number of read I/O operations per second
Interpretation:
- High r/s = high IOPS (random access likely)
✅ 5. w/s (Writes per second)
- Number of write operations per second
👉 Together with r/s:
- Indicates workload type:
  - OLTP → high r/s + w/s, small I/O
  - Analytics → lower r/s but large I/O size
✅ 6. rMB/s (Read throughput in MB/sec)
- Total data read per second
✅ 7. wMB/s (Write throughput in MB/sec)
- Total data written per second
🔎 Important:
| Pattern | Meaning |
|---|---|
| High r/s + low rMB/s | small random IO |
| Low r/s + high rMB/s | large sequential IO |
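One way to see which pattern a device falls into is to divide throughput by operation count, which gives the average size of a read. The sketch below is an illustration, not part of iostat: `avg_read_kb` is a hypothetical helper name, and it assumes the `iostat -xm` header shown above. It finds `r/s` and `rMB/s` by name because column positions differ between sysstat versions.

```shell
# avg_read_kb (hypothetical helper): average read size in KB per device.
# Assumes `iostat -xm` output with r/s and rMB/s columns in the header.
avg_read_kb() {
  awk '
    # Remember the position of every column from the header line.
    /^Device/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Device lines only, and only once both columns have been located.
    col["r/s"] && col["rMB/s"] && $1 ~ /^(sd|dm)/ {
      rs  = $(col["r/s"]) + 0
      rmb = $(col["rMB/s"]) + 0
      # MB/s * 1024 = KB/s; divided by reads/s gives KB per read.
      if (rs > 0) printf "%-10s %.1f KB/read\n", $1, rmb * 1024 / rs
    }'
}
```

Run it as `iostat -xm 2 5 | avg_read_kb`. A few KB per read points to small random I/O; hundreds of KB per read points to large sequential I/O.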
✅ 8. avgrq-sz (Average Request Size)
- Average size of each I/O request, reported in 512-byte sectors (divide by 2 to get KB)
Formula:
avgrq-sz = (total sectors read+written) / total I/O ops
Interpretation:
| Value | Meaning |
|---|---|
| < 32 KB | random IO (OLTP) |
| 64–256 KB | mixed |
| ~1024 KB (1MB) | sequential scan |
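The table is expressed in KB, while the iostat man page documents `avgrq-sz` in 512-byte sectors, so a conversion is needed before applying it. The following is a minimal sketch under stated assumptions: `rq_size_class` is a hypothetical name, and because the table leaves gaps between its bands, the cut-offs used here (below 32 KB, up to 256 KB, above 256 KB) are an assumption rather than something the table defines.

```shell
# rq_size_class (hypothetical helper): convert avgrq-sz to KB and label it.
# Assumption: band boundaries are chosen to close the gaps in the table above.
rq_size_class() {
  awk '
    /^Device/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
    col["avgrq-sz"] && $1 ~ /^(sd|dm)/ {
      kb = $(col["avgrq-sz"]) / 2        # 512-byte sectors -> KB
      label = (kb < 32) ? "random" : ((kb <= 256) ? "mixed" : "sequential")
      printf "%s %.1fKB %s\n", $1, kb, label
    }'
}
```

Usage: `iostat -xm 2 5 | rq_size_class`. On sysstat releases whose header does not contain an `avgrq-sz` column, the function prints nothing instead of guessing.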
✅ 9. avgqu-sz (Average Queue Length)
- Average number of I/O requests outstanding on the device (waiting in the queue or being serviced)
🚨 Critical metric:
| Value | Impact |
|---|---|
| < 1 | healthy |
| 1–5 | moderate |
| 10+ | pressure |
| 20+ | severe bottleneck |
👉 High value means:
- Disk is overloaded
- Requests are waiting → latency increase
✅ 10. await (Average Wait Time in ms)
- Average time (in ms) from when a request is issued until it completes:
await = queue time + service time
🚨 Thresholds:
| Value | Meaning |
|---|---|
| < 5 ms | excellent |
| 5–20 ms | acceptable |
| 20–50 ms | warning |
| > 50 ms | serious issue |
👉 This is the most important latency metric
✅ 11. r_await (Read latency)
- Avg time for read requests
✅ 12. w_await (Write latency)
- Avg time for write requests
Why split matters:
- Helps identify:
  - read-heavy issues (full scan)
  - write bottlenecks (redo/log/file sync)
✅ 13. svctm (Service Time)
- Time taken by disk to service request
- Does NOT include queue time
- It is an estimate: the iostat man page warns not to trust this field, so treat it as approximate
Important:
await ≈ svctm + queue delay
Interpretation:
| Case | Meaning |
|---|---|
| await ≈ svctm | no queue bottleneck |
| await >> svctm | queue contention |
👉 This is key for bottleneck detection
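The comparison can be made explicit by subtracting `svctm` from `await` for each device. This is only a sketch: `queue_delay` is a hypothetical helper name, it assumes both columns are present in the header, and since `svctm` is itself an estimate the result is best read as a rough indicator.

```shell
# queue_delay (hypothetical helper): approximate time spent queued, in ms.
# Assumes the iostat -x header contains both await and svctm.
queue_delay() {
  awk '
    /^Device/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
    col["await"] && col["svctm"] && $1 ~ /^(sd|dm)/ {
      aw = $(col["await"]) + 0
      sv = $(col["svctm"]) + 0
      # await ~ svctm + queue delay, so the difference approximates queueing.
      printf "%s await=%.2f svctm=%.2f queue_delay=%.2f\n", $1, aw, sv, aw - sv
    }'
}
```

Usage: `iostat -xm 2 5 | queue_delay`. A `queue_delay` near zero matches the "no queue bottleneck" row; a value many times larger than `svctm` matches "queue contention".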
✅ 14. %util (Utilization)
- Percentage of time disk was busy
🚨 Interpretation:
| Value | Meaning |
|---|---|
| < 60% | safe |
| 60–80% | moderate |
| 80–90% | high |
| > 90% | saturated |
👉 BUT:
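- Must be combined with await and avgqu-sz: devices that serve requests in parallel (SSDs, RAID arrays) can show high %util without being saturated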
🔥 Important Combined Interpretation
✅ Case 1 (Healthy high usage)
%util = 95%
await = 1 ms
avgqu-sz = 1
✔ Efficient disk
🚨 Case 2 (Bottleneck)
%util = 99%
await = 80 ms
avgqu-sz = 20
❌ Disk saturation + queue buildup
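The two cases can be told apart mechanically by looking at all three numbers together. The helper below is a hedged sketch: `io_verdict` is a hypothetical name, and the cut-offs (%util above 90, await above 20 ms, avgqu-sz of 10 or more) are borrowed from the threshold tables earlier in this post, not from iostat itself.

```shell
# io_verdict UTIL AWAIT_MS QUEUE (hypothetical helper)
# Assumption: thresholds are taken from the tables in this post.
io_verdict() {
  awk -v util="$1" -v await="$2" -v qu="$3" 'BEGIN {
    util += 0; await += 0; qu += 0       # force numeric comparison
    if (util > 90 && await > 20 && qu >= 10) print "bottleneck"
    else if (util > 90)                      print "busy but healthy"
    else                                     print "ok"
  }'
}
```

With the figures above, `io_verdict 95 1 1` reports `busy but healthy` (Case 1) and `io_verdict 99 80 20` reports `bottleneck` (Case 2).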
🧠 How You Should Read the Header (DBA Cheat Sheet)
Step-by-step analysis:
1. Check %util
   - Above 90% → possible saturation
2. Check avgqu-sz
   - High → queue backlog
3. Check await
   - Confirms latency impact
4. Compare await vs svctm
   - Big gap → queue delay
5. Check avgrq-sz
   - Understand workload type
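The steps above can be strung together into a single pass over the iostat output. This is a minimal sketch, assuming the header layout shown at the top of this post: `io_triage` is a hypothetical name, and the cut-offs (%util above 90, avgqu-sz of 10 or more, await above 50 ms) are taken from the threshold tables earlier in the post.

```shell
# io_triage (hypothetical helper): walk the checklist for each busy device.
# Assumption: thresholds come from the tables in this post, not from iostat.
io_triage() {
  awk '
    /^Device/ {
      for (i = 1; i <= NF; i++) col[$i] = i
      # Only proceed if every column the checklist needs is present.
      ready = col["%util"] && col["avgqu-sz"] && col["await"] && col["svctm"] && col["avgrq-sz"]
      next
    }
    ready && $1 ~ /^(sd|dm)/ {
      util = $(col["%util"]) + 0
      if (util <= 90) next                       # step 1: only busy devices
      qu = $(col["avgqu-sz"]) + 0                # step 2: queue backlog
      aw = $(col["await"]) + 0                   # step 3: latency impact
      sv = $(col["svctm"]) + 0                   # step 4: await vs svctm gap
      kb = $(col["avgrq-sz"]) / 2                # step 5: sectors -> KB
      printf "%s util=%.1f queue=%s latency=%s gap=%.1fms size=%.0fKB\n", \
        $1, util, (qu >= 10 ? "backlog" : "ok"), (aw > 50 ? "high" : "ok"), aw - sv, kb
    }'
}
```

Usage: `iostat -xm 2 5 | io_triage`. Devices at or below 90% utilization are skipped, so the output lists only the candidates worth a closer look.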
🎯 Why This Matters for You (Database Architect)
This header directly helps identify:
✅ DB Issues Mapping
| Metric | DB Problem |
|---|---|
| High rMB/s + large avgrq-sz | full table scan |
| High r/s, low size | index lookup |
| High w_await | commit / redo issues |
| High avgqu-sz | storage contention |
| High await | slow queries |
✅ Final Summary
- r/s, w/s → IOPS
- rMB/s, wMB/s → throughput
- avgrq-sz → I/O size (random vs sequential)
- avgqu-sz → pressure indicator 🚨
- await → real latency 🚨
- %util → saturation signal