Tuesday, May 26, 2026

iostat Linux command: a deep dive into troubleshooting performance issues

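# Note: the first report is the average since boot; where supported, add -y to skip it.
# Device name and %util for sd*/dm-* devices busier than 40%
iostat -xm 2 5 | awk '$1 ~ /^(sd|dm)/ && $NF+0 > 40 {printf "%-10s %s\n",$1,$NF"%"}'

# Full line for sd*/dm-* devices busier than 40% (the device filter keeps the banner and CPU lines out)
iostat -xm 2 5 | awk '$1 ~ /^(sd|dm)/ && $NF+0 > 40 {print}'

# Same, with the column header kept and a 90% threshold
iostat -xm 2 5 | awk '/Device/ {print; next} $1 ~ /^(sd|dm)/ && $NF+0 > 90 {print}'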

📌 Header Breakdown (Deep Explanation)

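Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util

Note: this is the column layout printed by older sysstat releases. Newer releases rename avgqu-sz to aqu-sz, replace avgrq-sz with rareq-sz and wareq-sz (reported in KB), and no longer print svctm.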

✅ 1. Device

Logical or physical disk name
- sdX → physical disks
- dm-X → device mapper (LVM, ASM, multipath)

👉 In your case:

dm-* = logical volumes / DB storage layers

✅ 2. rrqm/s (Read Requests Merged per second)

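Number of read requests per second that the I/O scheduler merged before sending them to the device

Why merging matters:

The OS combines reads of adjacent blocks into a single request, which reduces the number of I/Os the device has to handle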

👉 Example:

10 small reads → merged → 1 large read

✅ Interpretation:

High value → sequential I/O that is being merged efficiently
Zero → either random I/O (nothing adjacent to merge) or requests that already arrive large and contiguous

✅ 3. wrqm/s (Write Requests Merged per second)

Same as above but for writes

✅ High value:

Good for sequential writes (e.g., redo logs, batch loads)
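
To put a number on merging, one option is to express merges as a share of all requests submitted. The sketch below assumes each merge folds one request into another, so submitted requests ≈ rrqm/s + r/s (or wrqm/s + w/s). The helper name `merge_pct` is made up for this post.

```shell
# merge_pct MERGED_PER_SEC ISSUED_PER_SEC
# Prints the percentage of submitted requests that were merged.
merge_pct() {
    awk -v m="$1" -v i="$2" 'BEGIN {
        total = m + i
        if (total == 0) { print "0.0"; exit }   # idle device: nothing to merge
        printf "%.1f\n", 100 * m / total
    }'
}

# "10 small reads -> 1 large read" means 9 merges and 1 request issued.
merge_pct 9 1      # prints 90.0
```

A result near 0 with high r/s points at random I/O; a high result points at sequential access.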

✅ 4. r/s (Reads per second)

Number of read I/O operations per second

Interpretation:

High r/s = high IOPS (random access likely)

✅ 5. w/s (Writes per second)

Number of write operations per second

👉 Together with r/s:

Indicates workload type:
- OLTP → high r/s + w/s, small IO
- Analytics → lower r/s but large I/O size

✅ 6. rMB/s (Read throughput in MB/sec)

Total data read per second

✅ 7. wMB/s (Write throughput in MB/sec)

Total data written per second

🔎 Important:

Pattern	Meaning
High r/s + low rMB/s	small random IO
Low r/s + high rMB/s	large sequential IO
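
The two patterns above come down to one division: throughput divided by operations gives the average I/O size. A minimal sketch, with a hypothetical helper `avg_io_kb` and made-up sample numbers:

```shell
# avg_io_kb OPS_PER_SEC MB_PER_SEC
# Prints the average I/O size in KB implied by r/s + rMB/s (or w/s + wMB/s).
avg_io_kb() {
    awk -v ops="$1" -v mb="$2" 'BEGIN {
        if (ops == 0) { print "0.0"; exit }     # avoid dividing by zero on idle disks
        printf "%.1f\n", mb * 1024 / ops
    }'
}

avg_io_kb 4000 31.25   # high r/s, low rMB/s  -> prints 8.0    (small random IO)
avg_io_kb 200 200      # low r/s, high rMB/s  -> prints 1024.0 (large sequential IO)
```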

✅ 8. avgrq-sz (Average Request Size)

Average size of each I/O request, in 512-byte sectors (divide by 2 to get KB)

Formula:

avgrq-sz = (total sectors read + written) / total I/O ops

Interpretation (after converting to KB):

Size	Meaning
< 32 KB	random IO (OLTP)
64–256 KB	mixed
~1024 KB (1 MB)	sequential scan
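
Since the formula counts sectors, the raw avgrq-sz value has to be converted before comparing it with KB thresholds. A small sketch, assuming the usual 512-byte sector that iostat reports in; `sectors_to_kb` is a name invented here.

```shell
# sectors_to_kb AVGRQ_SZ
# Converts an avgrq-sz value (512-byte sectors) to KB.
sectors_to_kb() {
    awk -v s="$1" 'BEGIN { printf "%.1f\n", s * 512 / 1024 }'
}

sectors_to_kb 16      # prints 8.0     -> small, random-style IO
sectors_to_kb 2048    # prints 1024.0  -> ~1 MB, sequential scan
```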

✅ 9. avgqu-sz (Average Queue Length)

Average number of I/O requests outstanding for the device during the interval (queued plus in flight)

🚨 Critical metric:

Value	Impact
< 1	healthy
1–5	moderate
10+	pressure
20+	severe bottleneck

👉 High value means:

Disk is overloaded
Requests are waiting → latency increase
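
A handy cross-check is Little's law: under steady load, queue length ≈ IOPS × average latency. The sketch below is an approximation, not how iostat computes the column; `est_queue` is a hypothetical helper and the inputs are sample values.

```shell
# est_queue IOPS AWAIT_MS
# Estimates the average queue length from (r/s + w/s) and await in milliseconds.
est_queue() {
    awk -v iops="$1" -v await_ms="$2" 'BEGIN {
        printf "%.1f\n", iops * await_ms / 1000
    }'
}

est_queue 250 80     # prints 20.0 -> 250 IOPS at 80 ms keeps ~20 requests outstanding
est_queue 1000 1     # prints 1.0  -> 1000 IOPS at 1 ms keeps ~1 request outstanding
```

If the estimate and the reported avgqu-sz disagree wildly, the load was probably bursty within the interval.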

✅ 10. await (Average Wait Time in ms)

Average total time per I/O request, from submission to completion:
```
await = queue time + service time
```

🚨 Thresholds:

Value	Meaning
< 5 ms	excellent
5–20 ms	acceptable
20–50 ms	warning
> 50 ms	serious issue

👉 This is the most important latency metric
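
The thresholds above can be turned into a tiny classifier for scripts and alerts. This is a sketch only: the table does not say which band the exact values 5, 20 and 50 belong to, so the boundaries below are an assumption, and `classify_await` is a made-up name.

```shell
# classify_await AWAIT_MS
# Maps an await value (ms) onto the bands from the threshold table.
classify_await() {
    awk -v a="$1" 'BEGIN {
        if      (a < 5)   print "excellent"
        else if (a <= 20) print "acceptable"     # assumption: 20 counts as acceptable
        else if (a <= 50) print "warning"        # assumption: 50 counts as warning
        else              print "serious issue"
    }'
}

classify_await 3      # prints excellent
classify_await 80     # prints serious issue
```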

✅ 11. r_await (Read latency)

Avg time for read requests

✅ 12. w_await (Write latency)

Avg time for write requests

Why split matters:

Helps identify:
- read-heavy issues (full scan)
- write bottlenecks (redo/log/file sync)

✅ 13. svctm (Service Time)

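Time taken by disk to service request
Does NOT include queue time
Note: the iostat man page warns that svctm should not be trusted, and recent sysstat releases no longer print it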

Important:

await ≈ svctm + queue delay

Interpretation:

Case	Meaning
await ≈ svctm	no queue bottleneck
await >> svctm	queue contention

👉 This is key for bottleneck detection

✅ 14. %util (Utilization)

Percentage of time disk was busy

🚨 Interpretation:

Value	Meaning
< 60%	safe
60–80%	moderate
80–90%	high
> 90%	saturated

👉 BUT:

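Must combine with await + queue: devices that serve requests in parallel (SSDs, RAID arrays, SAN LUNs) can show close to 100% and still have headroom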

🔥 Important Combined Interpretation

✅ Case 1 (Healthy high usage)

%util = 95%
await = 1 ms
avgqu-sz = 1

✔ Efficient disk

🚨 Case 2 (Bottleneck)

%util = 99%
await = 80 ms
avgqu-sz = 20

❌ Disk saturation + queue buildup
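
The two cases can be told apart with a few comparisons. The sketch below reuses thresholds from the tables in this post (> 90% util, > 50 ms await, queue of 10+, < 5 ms await); the exact cut-offs and the name `disk_verdict` are assumptions for illustration.

```shell
# disk_verdict UTIL_PCT AWAIT_MS QUEUE_LEN
# Combines %util with await and avgqu-sz instead of trusting %util alone.
disk_verdict() {
    awk -v u="$1" -v a="$2" -v q="$3" 'BEGIN {
        if      (u <= 90)            print "not saturated"
        else if (a > 50 || q >= 10)  print "bottleneck"
        else if (a >= 5)             print "busy, watch latency"
        else                         print "busy but healthy"
    }'
}

disk_verdict 95 1 1      # Case 1 -> prints busy but healthy
disk_verdict 99 80 20    # Case 2 -> prints bottleneck
```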

🧠 How You Should Read Header (DBA Cheat Sheet)

Step-by-step analysis:

1. Check %util
- > 90% → possible saturation
2. Check avgqu-sz
- High → queue backlog
3. Check await
- Confirms latency impact
4. Compare await vs svctm
- Big gap → queue delay
5. Check avgrq-sz
- Understand workload type
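
The first three steps can be sketched as one awk filter. Because column positions differ between sysstat versions, this version looks columns up by name from the header line instead of hard-coding `$NF`. The function name `iostat_triage`, the device name pattern, and the cut-offs (90%, queue 10, 50 ms) are assumptions; adjust them to your environment.

```shell
# iostat_triage: reads `iostat -x` output on stdin and flags busy devices.
iostat_triage() {
    awk '
    $1 ~ /^Device/ {
        split("", col)                            # reset the name -> position map
        for (i = 1; i <= NF; i++) col[$i] = i
        next
    }
    !("%util" in col) || $1 !~ /^(sd|dm|nvme)/ { next }   # skip banner, CPU and other lines
    {
        util  = $(col["%util"]) + 0
        qname = ("avgqu-sz" in col) ? "avgqu-sz" : "aqu-sz"   # old vs new column name
        queue = (qname in col) ? $(col[qname]) + 0 : 0
        if ("await" in col) {
            await = $(col["await"]) + 0
        } else {
            # No combined await column: fall back to the worse of read and write latency.
            r = ("r_await" in col) ? $(col["r_await"]) + 0 : 0
            w = ("w_await" in col) ? $(col["w_await"]) + 0 : 0
            await = (r > w) ? r : w
        }
        if (util > 90 && (queue >= 10 || await > 50))
            printf "%s BOTTLENECK util=%.1f queue=%.1f await=%.1f\n", $1, util, queue, await
        else if (util > 90)
            printf "%s BUSY util=%.1f queue=%.1f await=%.1f\n", $1, util, queue, await
    }'
}

# Usage (not run here):
#   iostat -xm 2 5 | iostat_triage
```

Devices below 90% utilization produce no output, so an empty result means nothing crossed the first check.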

🎯 Why This Matters for You (Database Architect)

This header directly helps identify:

✅ DB Issues Mapping

Metric	DB Problem
High rMB/s + large avgrq-sz	full table scan
High r/s + small avgrq-sz	index lookup
High w_await	commit / redo issues
High avgqu-sz	storage contention
High await	slow queries

✅ Final Summary

r/s, w/s → IOPS
rMB/s, wMB/s → throughput
avgrq-sz → IO size (random vs sequential)
avgqu-sz → pressure indicator 🚨
await → real latency 🚨
%util → saturation signal

ORACLE DATABASE PROBLEM AND SOLUTIONS
