✅ 1. Objective
๐ Correlate:
- OS layer →
iostat(disk bottleneck) - DB layer → AWR (SQL causing I/O)
๐ Goal: Identify which SQL caused the spike on dm- disks*
๐ง ✅ 2. Correlation Logic (Core Concept)
Disk spike (iostat)
↓
Time window
↓
AWR snapshot
↓
Top IO SQL
↓
Execution plan
↓
Root cause
⏱️ ✅ 3. STEP 1: Identify Spike Time from iostat
Example:
dm-69 → 100% util
await → 97 ms
๐ Note:
- Exact time window (e.g.)
10:02 AM – 10:10 AM
๐ ✅ 4. STEP 2: Find Matching AWR Snapshot
๐ Pick snapshots covering spike:
Snap 101 → 10:00
Snap 102 → 10:10
๐ฅ ✅ 5. STEP 3: Identify Top SQL by I/O
✅ Query 1: Top Disk Read SQL
✅ Query 2: Top Read Throughput
✅ Query 3: Full Table Scan Candidates
๐ ✅ 6. STEP 4: Identify Wait Events
✅ Interpretation:
| Wait Event | Meaning |
|---|---|
| db file scattered read | Full table scan ๐จ |
| db file sequential read | Index access |
| direct path read | Big scans (DW) |
| db file parallel read | Parallel scans |
๐ ✅ 7. STEP 5: Map SQL → Execution Plan
✅ Look for:
TABLE ACCESS FULL๐จINDEX RANGE SCANFULL OUTER JOIN- Parallel operations
๐ ✅ 8. STEP 6: Correlate with Disk Pattern
๐ด Case 1 (Your Earlier Example)
iostat:
avgrq-sz ~ 1024 KB
high rMB/s
high await
๐ AWR shows:
db file scattered read
direct path read
✅ Conclusion: Large full table scans causing disk saturation
๐ข Case 2
small avgrq-sz
high IOPS
low await
๐ AWR:
db file sequential read
✅ Conclusion: Healthy OLTP workload
๐งช ✅ 9. Advanced Correlation Query (BEST ONE)
๐ฏ ✅ 10. Root Cause Identification Matrix
| iostat Pattern | AWR Finding | Root Cause |
|---|---|---|
| Large IO + high latency | scattered read | Full table scan |
| High read MB/s | direct path read | Large query |
| High write latency | log file sync | Commit bottleneck |
| High queue | many active sessions | Concurrency overload |
๐จ ✅ 11. Real Example (Like Your Case)
iostat
%util = 100
await = 97 ms
avgqu-sz = 24
rMB/s = high
AWR
Event: db file scattered read
SQL_ID: abc123xyz
Disk Reads: very high
Plan: TABLE ACCESS FULL
✅ FINAL ROOT CAUSE
๐ A large SQL doing full table scan
๐ Saturating disk -> causing high latency
✅ ✅ 12. Fix Strategy
✅ SQL Level
- Add indexes
- Rewrite query
- Avoid full scans
✅ DB Level
- Increase buffer cache
- Enable parallel limits
✅ Storage Level
- Spread datafiles
- Use faster disk tier
✅ ✅ 13. Ultimate One-Liner Workflow (Production)
iostat spike → note time
→ find AWR snapshot
→ find top IO SQL
→ check wait events
→ review execution plan
→ fix SQL