Wednesday, May 27, 2026

How to Identify and fix Index is Causing Performance Issue in oracle database ?

🔎 PART 1 — How to Identify Index is Causing Performance Issue

✅ Step 1: Find Slow SQL (Entry Point)

Start from Top SQL, not from the index.

SELECT sql_id, executions, elapsed_time, buffer_gets

FROM v$sql

ORDER BY elapsed_time DESC FETCH FIRST 10 ROWS ONLY;

👉 Pick the problematic SQL

✅ Step 2: Check Execution Plan

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('sql_id', NULL, 'ALLSTATS LAST'));

🔍 Look for:

Symptom	Meaning
FULL TABLE SCAN	Index missing or ignored ❌
INDEX RANGE SCAN with high cost	Index inefficient ❌
INDEX FULL SCAN	Poor selectivity ❌

✅ Step 3: Check Index Access Cost

In plan, observe:

Cost
Rows
Bytes
Buffer Gets

👉 If index scan:

Reads too many blocks → Index is not selective

✅ Step 4: Check Clustering Factor (Critical)

SELECT index_name, clustering_factor, num_rows
FROM dba_indexes
WHERE index_name = 'INDEX_NAME';

Interpretation:

CF ≈ number of rows → ❌ Random reads (BAD index)
CF ≈ number of blocks → ✅ Efficient index

👉 High CF → more I/O → slower query

✅ Step 5: Check Selectivity (Very Important)

SELECT num_distinct, num_rows
FROM dba_tab_col_statistics
WHERE table_name = 'TABLE_NAME'
AND column_name = 'COLUMN_NAME';

👉 Formula:

Selectivity = num_distinct / num_rows

Result:

Value	Meaning
High (close to 1)	✅ Good index
Low (<0.1)	❌ Bad index

👉 Example:

status = 'ACTIVE' → bad index
customer_id → good index

✅ Step 6: Check If Index is Used Properly

ALTER INDEX schema.index_name MONITORING USAGE;

Then:

SELECT * FROM v$object_usage WHERE index_name = 'INDEX_NAME';

👉 Result:

USED = NO → Index useless
USED = YES but slow → Index is problem

✅ Step 7: Check Logical Reads (Performance Impact)

FROM v$segment_statistics
WHERE object_name = 'INDEX_NAME'
AND statistic_name = 'logical reads';

👉 High logical reads = heavy I/O → possible issue

✅ Step 8: Check Index BLEVEL (Depth)

SELECT index_name, blevel
FROM dba_indexes
WHERE index_name = 'INDEX_NAME';

BLEVEL	Meaning
0–2	✅ Good
3–4	⚠️ Ok
>4	❌ Too deep

✅ Step 9: Validate Stats

EXEC DBMS_STATS.GATHER_INDEX_STATS('SCHEMA','INDEX_NAME');

👉 Bad stats = wrong execution plan

✅ Step 10: Compare With Full Table Scan

Force test:

SELECT /*+ FULL(table_name) */ ..

👉 If FULL scan is faster → Index is hurting performance

🚨 Conclusion (Index Causing Issue If):

High clustering factor
Low selectivity
High logical reads
Not used or incorrectly used
Execution plan shows inefficiency

✅ PART 2 — How to Check Index is GOOD or BAD

🎯 Step-by-Step Evaluation Framework

✅ Step 1: Usage Check

SELECT * FROM v$object_usage

WHERE index_name = 'INDEX_NAME';

Result	Decision
USED frequently	✅ Good
NOT USED	❌ Drop candidate

✅ Step 2: Query Alignment

👉 Check if index matches WHERE clause:

WHERE col1 = :1 AND col2 = :2

👉 Good index:

(col1, col2)

👉 Bad index:

(col3, col4)

✅ Step 3: Selectivity Check

High distinct values → ✅ Good
Low distinct values → ❌ Bad

✅ Step 4: Clustering Factor

Low CF → ✅ Good
High CF → ❌ Bad

✅ Step 5: Size vs Benefit

SELECT bytes/1024/1024 MB
FROM dba_segments
WHERE segment_name='INDEX_NAME';

👉 Large + rarely used → ❌ Bad

✅ Step 6: DML Overhead Check

👉 Too many indexes on high DML table:

Slows INSERT/UPDATE/DELETE

SELECT index_name
FROM dba_indexes
WHERE table_name = 'TABLE_NAME';

👉 More indexes ≠ better

✅ Step 7: Redundancy Check

👉 Duplicate indexes = waste

✅ Step 8: BLEVEL + Leaf Blocks

👉 Small, compact index = good
👉 Huge fragmented index = bad

✅ FINAL DECISION MATRIX

Condition	Good Index ✅	Bad Index ❌
Usage	Frequent	Not used
Execution Plan	Index scan	Full scan
Selectivity	High	Low
Clustering Factor	Low	High
BLEVEL	≤2	>4
Size	Optimized	Huge unused
DML Impact	Low	High

🔥 REAL-WORLD QUICK CHECK SCRIPT

i.index_name,
i.clustering_factor,
i.num_rows,
i.distinct_keys,
s.bytes/1024/1024 MB
FROM dba_indexes i
JOIN dba_segments s
ON i.index_name = s.segment_name
WHERE i.owner = 'SCHEMA_NAME'
ORDER BY s.bytes DESC;

💡 Architect-Level Insight (Important)

👉 Don’t judge index in isolation

Always evaluate:

SQL → Execution Plan → Index → Data Pattern

✅ Recommended Approach (Best Practice)

Identify top SQL (AWR/V$SQL)
Analyze execution plan
Validate index usage
Check clustering factor + selectivity
Decide: KEEP / REBUILD / DROP

INDEX HEALTH CLASSIFICATION SCRIPT

set lines 300 pages 500

col OWNER for a10

col INDEX_NAME for a50

col TABLE_NAME for a50

WITH idx_stats AS (

SELECT

i.owner,

i.index_name,

i.table_name,

i.blevel,

i.clustering_factor,

i.num_rows,

i.distinct_keys,

s.bytes/1024/1024 AS size_mb,

CASE

WHEN i.num_rows > 0

THEN ROUND(i.distinct_keys / i.num_rows, 4)

ELSE 0

END AS selectivity

FROM dba_indexes i

JOIN dba_segments s

ON i.owner = s.owner

AND i.index_name = s.segment_name

WHERE i.owner IN ('RITRS')

AND CLUSTERING_FACTOR >700

usage_stats AS (

SELECT

index_name,

CASE

WHEN used = 'YES' THEN 'USED'

ELSE 'NOT_USED'

END AS usage_status

FROM v$object_usage

)

SELECT

i.owner,

i.index_name,

i.table_name,

i.size_mb,

i.blevel,

i.clustering_factor,

i.num_rows,

i.distinct_keys,

i.selectivity,

NVL(u.usage_status, 'UNKNOWN') AS usage_status,

CASE

WHEN NVL(u.usage_status, 'NOT_USED') = 'NOT_USED'

THEN 'DROP_CANDIDATE'

WHEN i.selectivity < 0.05

THEN 'BAD_INDEX_LOW_SELECTIVITY'

WHEN i.clustering_factor > i.num_rows * 0.9

THEN 'BAD_INDEX_HIGH_CF'

WHEN i.blevel > 4

THEN 'REBUILD_REQUIRED'

WHEN i.size_mb > 500

AND NVL(u.usage_status, 'NOT_USED') <> 'USED'

THEN 'REVIEW_BIG_UNUSED_INDEX'

ELSE 'GOOD_INDEX'

END AS index_health_status

FROM idx_stats i

LEFT JOIN usage_stats u

ON i.index_name = u.index_name

ORDER BY i.size_mb DESC

Alert Fix : 1 index(es) with too many extents in oracle database - oracle performance issue

🔍 What it Means

Oracle stores segments (tables, indexes) in extents (contiguous blocks).
Over time, due to growth, rebuilds, and DML, an index may end up with:
- Hundreds or thousands of small extents
This is usually flagged when:

extents > threshold (e.g., 1000 or configurable limit)

⚠️ Why It Matters

Generally not critical in modern Oracle (ASSM + LMT) environments.
But can still cause:
- Slight performance overhead in segment management
- Longer backup/restore times
- Inefficient disk usage in older systems
- Red flag in audits / compliance reports

👉 In most modern systems, it's more of a hygiene issue than a real performance problem.

✅ Identify the Problem Index

Run:

SELECT owner, segment_name, segment_type, extents

FROM dba_segments

WHERE segment_type = 'INDEX'

AND extents > 500 -- adjust threshold

ORDER BY extents DESC;

Or specifically:

SELECT owner, index_name

FROM dba_indexes

WHERE index_name IN (

SELECT segment_name

FROM dba_segments

WHERE extents > 500

);

🛠️ Fix Options

✅ Option 1: Rebuild the Index (Recommended)

ALTER INDEX schema_name.index_name REBUILD;

Benefits:

Reduces number of extents
Compacts index
Eliminates fragmentation

👉 If large index:

ALTER INDEX schema_name.index_name REBUILD ONLINE;

✅ Option 2: Rebuild with Storage Clause

To control extent allocation:

ALTER INDEX schema_name.index_name REBUILD

STORAGE (INITIAL 100M NEXT 100M);

👉 Helps avoid many small extents in future.

✅ Option 3: Move to Different Tablespace

If fragmentation is due to tablespace issues:

ALTER INDEX schema_name.index_name

REBUILD TABLESPACE new_tablespace;

✅ Option 4: Coalesce (Less Effective)

ALTER INDEX schema_name.index_name COALESCE;

Merges adjacent leaf blocks
Does NOT reduce extents
Safer but less impactful

🧠 Root Cause Analysis

Common reasons:

Autoallocate tablespace creating many uneven extents
Frequent index growth/shrink cycles
Old dictionary-managed tablespaces (DMT)
Poor storage parameters
Partitioned index mismanagement

🎯 Best Practices (Going Forward)

✅ Use:

Locally Managed Tablespaces (LMT)
AUTOALLOCATE or UNIFORM extent size

✅ For large indexes:

Predefine proper extent sizes:

CREATE INDEX idx_name
ON table_name(column)
TABLESPACE ts_name
STORAGE (INITIAL 128M NEXT 128M);

✅ Monitor periodically:

SELECT segment_name, extents

FROM dba_segments

WHERE extents > 500;

🚨 When You Can Ignore It

You can safely ignore if:

Database uses ASSM + LMT
No performance issues observed
Extents are auto-managed efficiently

✅ Recommendation for You (Database Architect Perspective)

Given your role, I’d suggest:

✅ Validate index size vs extents ratio
✅ Rebuild only if:
- Extents are very high (>1000)
- Index is frequently accessed
✅ Automate check in health scripts
✅ Standardize storage clauses for large indexes

💬 Summary

Issue	Action
Too many extents	Rebuild index
Fragmentation	Rebuild or move
Minor issue	Ignore if no perf impact

Tuesday, May 26, 2026

Is CPU issue ? troubleshooting workflow for oracle database performance issue with automation

✅ ✅ 1. CPU Troubleshooting Framework (Like iostat for CPU)

Use:

vmstat 2 5

top

📊 ✅ 2. CPU Metrics Explained (vmstat / top)

us sy id wa st

Metric	Meaning
us	User CPU (app/DB queries)
sy	System CPU (kernel/syscalls)
id	Idle CPU
wa	IO wait
st	Steal (VM contention)

✅ ✅ 3. CPU Baseline Table (Production Standard)

📊 CPU Utilization

Metric	Good ✅	Warning ⚠️	Critical 🚨
idle (id)	> 40%	20–40%	< 20%
user (us)	< 50%	50–70%	> 70%
system (sy)	< 20%	20–30%	> 30%
iowait (wa)	< 5%	5–15%	> 20%
steal (st)	0	1–5	> 5

📊 Run Queue (Very Important)

r (run queue)

Metric	Good ✅	Warning ⚠️	Critical 🚨
r vs CPU cores	≤ cores	1.5x cores	2x+ cores

✅ ✅ 4. CPU Health Decision Matrix

Condition	Meaning
High us	CPU-heavy queries
High sy	kernel/system overhead
High wa	not CPU → disk issue
High r	CPU contention
High st	VM issue

✅ ✅ 5. CPU Scoring Model (Like Disk)

🎯 Formula

Score = 100 
- user penalty 
- system penalty 
- run queue penalty 
- iowait penalty

📊 Penalties

✅ User CPU

<50 → 0
50–70 → -15
70 → -30

✅ System CPU

<20 → 0
20–30 → -15
30 → -30

✅ Run Queue

≤ cores → 0
1–2x cores → -20
2x cores → -40

✅ IO Wait

<5 → 0
5–15 → -20
15 → -40

✅ Final Status

Score	Status
80–100	✅ Healthy
60–80	⚠️ Warning
40–60	🔶 Degraded
<40	🚨 Critical

✅ ✅ 6. CPU Monitoring Script (PRODUCTION READY)

📜 `cpu_health.sh`

#!/bin/bash

echo "===== CPU Health Check ====="
date

vmstat 2 3 | tail -1 | awk '
{
r=$1
us=$13
sy=$14
id=$15
wa=$16
st=$17

score=100

# user penalty
if (us > 70) score -= 30
else if (us > 50) score -= 15

# system penalty
if (sy > 30) score -= 30
else if (sy > 20) score -= 15

# iowait penalty
if (wa > 15) score -= 40
else if (wa > 5) score -= 20

# status
status="HEALTHY"
if (score < 40) status="CRITICAL"
else if (score < 60) status="DEGRADED"
else if (score < 80) status="WARNING"

printf "RunQ=%d User=%d%% Sys=%d%% Idle=%d%% IOwait=%d%% Status=%s Score=%d\n",r,us,sy,id,wa,status,score
}
'

Identify CPU Bottleneck (Root Cause)

✅ Step 1: Check top processes

top -o %CPU

✅ Step 2: Per process CPU

ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu | head

✅ Step 3: Thread-level (VERY IMPORTANT for DB)

top -H

✅ Step 4: Per process breakdown

pidstat -u 2

✅ ✅ 8. CPU Issue Patterns (DBA Mapping)

🔴 Pattern 1: High USER CPU

us > 70%

👉 Cause:

Complex SQL
Full scans
bad execution plans

✅ DB Mapping:

SELECT sql_id, cpu_time

FROM v$sql

ORDER BY cpu_time DESC FETCH FIRST 10 ROWS;

🔴 Pattern 2: High SYSTEM CPU

sy > 30%

👉 Cause:

excessive context switching
kernel overhead
I/O interrupts

✅ Check:

vmstat 1

Look at:

cs (context switches)
in (interrupts)

🔴 Pattern 3: High RUN QUEUE

r >> CPU cores

👉 Cause:

CPU contention
too many concurrent queries

🔴 Pattern 4: High IO WAIT

wa > 20%

👉 Cause:

NOT CPU problem
👉 disk bottleneck (go back to iostat)

🔴 Pattern 5: High STEAL (VM)

st > 5%

👉 Cause:

host contention (cloud/VM issue)

✅ ✅ 9. CPU Alert Script (Automation)

📜 `cpu_alert.sh`

#!/bin/bash

HOST=$(hostname)

ALERT=$(vmstat 2 2 | tail -1 | awk '
{
us=$13; sy=$14; wa=$16; st=$17

if (us>70 || sy>30 || wa>15 || st>5)
print "CPU Issue: user="us"% sys="sy"% wa="wa"% st="st"%"
}
')

if [ ! -z "$ALERT" ]; then
echo "$ALERT" | mail -s "CPU Alert on $HOST" your_email@company.com
fi
``

✅ ✅ 10. DBA Troubleshooting Flow

🔥 When CPU alert triggers:

1. Check:

vmstat 1 5

2. Identify pattern:

High us → SQL issue
High sy → OS/kernel
High wa → disk issue

3. Identify process:

top

4. Map to DB:

SELECT sql_id, cpu_time FROM v$sql ORDER BY cpu_time DESC;

5. Fix:

Tune query ✅
Index ✅
Reduce parallelism ✅
Limit sessions ✅

✅ ✅ 11. Golden Rules (CPU)

✅ Healthy

idle > 40%
wa < 5%
r <= cores

🚨 Critical CPU

idle < 10%
us > 70%
r > 2x cores

⚠️ Trick Case (VERY IMPORTANT)

CPU looks high BUT wa is high
👉 NOT CPU issue → DISK issue

🎯 Final Outcome

Now you have:

✅ CPU baseline table
✅ Health scoring model
✅ Monitoring script
✅ Alerting system
✅ DB correlation
✅ Troubleshooting flow

Is storage issue ? troubleshooting workflow for oracle database performance issue with automation

✅ 1. AUTO EMAIL ALERT SCRIPT (Real Production Use)

This script:

Detects disk bottlenecks
Sends email alert automatically

📜 Script: `disk_alert_mail.sh`

#!/bin/bash

HOST=$(hostname)
DATE=$(date)

ALERT=$(iostat -xm 2 2 | awk '
$1 ~ /^(sd|dm)/ {
util=$NF
await=$(NF-3)
queue=$(NF-4)

if (util>95 || await>50 || queue>10) {
printf "%-8s util=%s%% await=%sms queue=%s\n", $1, util, await, queue
}
}
')

if [ ! -z "$ALERT" ]; then
echo "Disk Alert on $HOST at $DATE

Critical disks detected:

$ALERT
" | mail -s "🚨 Disk I/O Alert on $HOST" your_email@company.com
fi

✅ Setup Cron (every 5 min)

crontab -e

Add:

*/5 * * * * /path/disk_alert_mail.sh

✅ ✅ 2. AUTO MAP dm-* → MOUNT → DB FILE (Game-Changer)

This solves your biggest pain: 👉 “Which DB object is causing this?”

📜 Script: `disk_to_db_map.sh`

#!/bin/bash

echo "==== Disk → Mount → Files ===="

for dev in $(lsblk -dn -o NAME | grep dm-); do
echo ""
echo "Device: $dev"

mount=$(lsblk -n -o MOUNTPOINT /dev/$dev 2>/dev/null)

if [ ! -z "$mount" ]; then
echo "Mount: $mount"
echo "Files:"
lsof +D $mount 2>/dev/null | head -10
else
echo "No mount (maybe ASM/LVM raw)"
fi
done

✅ Output Example

Device: dm-xx
Mount: /u02
Files:
oracle  db  datafile1.dbf
oracle  db  users01.dbf

✅ ✅ 3. CORRELATE iostat + ORACLE (Find exact SQL)

✅ Step 1: Identify heavy disk (from iostat)

Example:

dm-69 → /u02 → USERS tablespace

✅ Step 2: Find SQL causing reads

SELECT sql_id,
executions,
disk_reads,
buffer_gets,
ROUND(disk_reads/executions) avg_reads
ORDER BY disk_reads DESC
FETCH FIRST 10 ROWS ONLY;

✅ Step 3: Active session (real-time)

SELECT sid, serial#, sql_id, event
FROM v$session
WHERE wait_class='User I/O';

✅ Step 4: Wait event mapping

Wait Event	Meaning
`db file scattered read`	Full table scan 🚨
`db file sequential read`	Index lookup
`direct path read`	Large scan
`log file sync`	Commit slow

✅ ✅ 4. ADVANCED ALL-IN-ONE SCRIPT (MOST POWERFUL)

👉 Combines:

Disk health
Alert detection
Mapping

📜 `ultimate_disk_analyzer.sh`

#!/bin/bash

echo "===== Ultimate Disk Analyzer ====="
date
echo ""

iostat -xm 2 2 | awk '
/Device/ {
printf "%-8s %-6s %-6s %-6s %-10s\n","Dev","Util","Await","Queue","Status"
next
}

$1 ~ /^(sd|dm)/ {
util=$NF
await=$(NF-3)
queue=$(NF-4)

status="OK"

if (util>95 || await>50 || queue>10)
status="CRITICAL"
else if (util>80 || await>20 || queue>5)
status="WARNING"

printf "%-8s %-6.1f %-6.1f %-6.1f %-10s\n",$1,util,await,queue,status
}'

✅ ✅ 5. Real DBA Troubleshooting Flow (What YOU should do)

🔥 Step-by-step when alert triggers:

1. Run:

iostat -xm 2 5

2. Identify:

High %util
High await
High queue

3. Map:

lsblk

df -h

4. Check DB:

SELECT * FROM v$session WHERE event LIKE '%read%';

5. Fix:

Query tuning ✅
Indexing ✅
Move data ✅
Balance disks ✅

✅ ✅ 6. PRO TIPS (REAL WORLD)

🔴 When to PANIC

await > 100 ms
queue > 20
util = 100% sustained

🟢 When it's OK

util = 100% BUT await < 5
👉 (fast SSD under load)

🧠 Golden formula:

Problem = High Util + High Await + High Queue

🎯 Final Outcome

You now have:

✅ Auto alert system
✅ Disk → DB mapping
✅ SQL root cause detection
✅ Health scoring
✅ Full troubleshooting workflow

Approach to correlate AWR + iostat to deep drive and troubleshoot oracle database performance issue

✅ 1. Objective

👉 Correlate:

OS layer → iostat (disk bottleneck)
DB layer → AWR (SQL causing I/O)

👉 Goal: Identify which SQL caused the spike on dm- disks*

🧠 ✅ 2. Correlation Logic (Core Concept)

Disk spike (iostat)
      ↓
Time window
      ↓
AWR snapshot
      ↓
Top IO SQL
      ↓
Execution plan
      ↓
Root cause

⏱️ ✅ 3. STEP 1: Identify Spike Time from iostat

Example:

dm-69 → 100% util
await → 97 ms

👉 Note:

Exact time window (e.g.)

10:02 AM – 10:10 AM

📊 ✅ 4. STEP 2: Find Matching AWR Snapshot

SELECT snap_id, begin_interval_time, end_interval_time
FROM dba_hist_snapshot
ORDER BY snap_id DESC;

👉 Pick snapshots covering spike:

Snap 101 → 10:00
Snap 102 → 10:10

🔥 ✅ 5. STEP 3: Identify Top SQL by I/O

✅ Query 1: Top Disk Read SQL

executions_delta,
disk_reads_delta,
buffer_gets_delta,
elapsed_time_delta/1000000 elapsed_sec,
ROUND(disk_reads_delta/DECODE(executions_delta,0,1,executions_delta)) reads_per_exec
FROM dba_hist_sqlstat
WHERE snap_id BETWEEN :snap1 AND :snap2
ORDER BY disk_reads_delta DESC
WHERE ROWNUM <= 10;

✅ Query 2: Top Read Throughput

disk_reads_delta,
buffer_gets_delta,
rows_processed_delta,
elapsed_time_delta/1000000 elapsed_sec
FROM dba_hist_sqlstat
WHERE snap_id BETWEEN :snap1 AND :snap2
ORDER BY disk_reads_delta DESC
WHERE ROWNUM <= 10;

✅ Query 3: Full Table Scan Candidates

SELECT sql_id,
disk_reads_delta,
executions_delta,
ROUND(disk_reads_delta/DECODE(executions_delta,0,1,executions_delta)) reads_per_exec
FROM dba_hist_sqlstat
WHERE snap_id BETWEEN :snap1 AND :snap2
AND disk_reads_delta > 100000
ORDER BY disk_reads_delta DESC;

🔍 ✅ 6. STEP 4: Identify Wait Events

SELECT event, total_waits_delta, time_waited_delta/1000 time_waited_ms
FROM dba_hist_system_event
WHERE snap_id BETWEEN :snap1 AND :snap2
AND event LIKE 'db file%'
ORDER BY time_waited_ms DESC;

✅ Interpretation:

Wait Event	Meaning
db file scattered read	Full table scan 🚨
db file sequential read	Index access
direct path read	Big scans (DW)
db file parallel read	Parallel scans

📈 ✅ 7. STEP 5: Map SQL → Execution Plan

FROM dba_hist_sql_plan
WHERE sql_id = '&sql_id'
ORDER BY id;

✅ Look for:

TABLE ACCESS FULL 🚨
INDEX RANGE SCAN
FULL OUTER JOIN
Parallel operations

🔗 ✅ 8. STEP 6: Correlate with Disk Pattern

🔴 Case 1 (Your Earlier Example)

iostat:
avgrq-sz ~ 1024 KB
high rMB/s
high await

👉 AWR shows:

db file scattered read
direct path read

✅ Conclusion: Large full table scans causing disk saturation

🟢 Case 2

small avgrq-sz
high IOPS
low await

👉 AWR:

db file sequential read

✅ Conclusion: Healthy OLTP workload

🧪 ✅ 9. Advanced Correlation Query (BEST ONE)

t.sql_text,
s.executions_delta,
s.disk_reads_delta,
ROUND(s.disk_reads_delta/DECODE(s.executions_delta,0,1,s.executions_delta)) reads_per_exec,
s.elapsed_time_delta/1000000 elapsed_sec
FROM dba_hist_sqlstat s,
dba_hist_sqltext t
WHERE s.sql_id = t.sql_id
AND s.snap_id BETWEEN :snap1 AND :snap2
ORDER BY s.disk_reads_delta DESC
FETCH FIRST 10 ROWS ONLY;

🎯 ✅ 10. Root Cause Identification Matrix

iostat Pattern	AWR Finding	Root Cause
Large IO + high latency	scattered read	Full table scan
High read MB/s	direct path read	Large query
High write latency	log file sync	Commit bottleneck
High queue	many active sessions	Concurrency overload

🚨 ✅ 11. Real Example (Like Your Case)

iostat

%util = 100
await = 97 ms
avgqu-sz = 24
rMB/s = high

AWR

Event: db file scattered read
SQL_ID: abc123xyz
Disk Reads: very high
Plan: TABLE ACCESS FULL

✅ FINAL ROOT CAUSE

👉 A large SQL doing full table scan
👉 Saturating disk -> causing high latency

✅ ✅ 12. Fix Strategy

✅ SQL Level

Add indexes
Rewrite query
Avoid full scans

✅ DB Level

Increase buffer cache
Enable parallel limits

✅ Storage Level

Spread datafiles
Use faster disk tier

✅ ✅ 13. Ultimate One-Liner Workflow (Production)

iostat spike → note time
→ find AWR snapshot
→ find top IO SQL
→ check wait events
→ review execution plan
→ fix SQL

Wednesday, May 27, 2026

🔎 PART 1 — How to Identify Index is Causing Performance Issue

✅ Step 1: Find Slow SQL (Entry Point)

✅ Step 2: Check Execution Plan

🔍 Look for:

✅ Step 3: Check Index Access Cost

✅ Step 4: Check Clustering Factor (Critical)

Interpretation:

✅ Step 5: Check Selectivity (Very Important)

Result:

✅ Step 6: Check If Index is Used Properly

✅ Step 7: Check Logical Reads (Performance Impact)

✅ Step 8: Check Index BLEVEL (Depth)

✅ Step 9: Validate Stats

✅ Step 10: Compare With Full Table Scan

🚨 Conclusion (Index Causing Issue If):

✅ PART 2 — How to Check Index is GOOD or BAD

🎯 Step-by-Step Evaluation Framework

✅ Step 1: Usage Check

✅ Step 2: Query Alignment

✅ Step 3: Selectivity Check

✅ Step 4: Clustering Factor

✅ Step 5: Size vs Benefit

✅ Step 6: DML Overhead Check

✅ Step 7: Redundancy Check

✅ Step 8: BLEVEL + Leaf Blocks

✅ FINAL DECISION MATRIX

🔥 REAL-WORLD QUICK CHECK SCRIPT

💡 Architect-Level Insight (Important)

✅ Recommended Approach (Best Practice)

INDEX HEALTH CLASSIFICATION SCRIPT

🔍 What it Means

⚠️ Why It Matters

✅ Identify the Problem Index

🛠️ Fix Options

✅ Option 1: Rebuild the Index (Recommended)

✅ Option 2: Rebuild with Storage Clause

✅ Option 3: Move to Different Tablespace

✅ Option 4: Coalesce (Less Effective)

🧠 Root Cause Analysis

🎯 Best Practices (Going Forward)

🚨 When You Can Ignore It

✅ Recommendation for You (Database Architect Perspective)

💬 Summary

Tuesday, May 26, 2026

✅ ✅ 1. CPU Troubleshooting Framework (Like iostat for CPU)

📊 ✅ 2. CPU Metrics Explained (vmstat / top)

✅ ✅ 3. CPU Baseline Table (Production Standard)

📊 CPU Utilization

📊 Run Queue (Very Important)

✅ ✅ 4. CPU Health Decision Matrix

✅ ✅ 5. CPU Scoring Model (Like Disk)

🎯 Formula

📊 Penalties

✅ User CPU

✅ System CPU

✅ Run Queue

✅ IO Wait

✅ Final Status

✅ ✅ 6. CPU Monitoring Script (PRODUCTION READY)

📜 cpu_health.sh

✅ Step 1: Check top processes

✅ Step 2: Per process CPU

✅ Step 3: Thread-level (VERY IMPORTANT for DB)

✅ Step 4: Per process breakdown

✅ ✅ 8. CPU Issue Patterns (DBA Mapping)

🔴 Pattern 1: High USER CPU

🔴 Pattern 2: High SYSTEM CPU

🔴 Pattern 3: High RUN QUEUE

🔴 Pattern 4: High IO WAIT

🔴 Pattern 5: High STEAL (VM)

✅ ✅ 9. CPU Alert Script (Automation)

📜 cpu_alert.sh

✅ ✅ 10. DBA Troubleshooting Flow

🔥 When CPU alert triggers:

1. Check:

2. Identify pattern:

3. Identify process:

4. Map to DB:

5. Fix:

📜 `cpu_health.sh`

📜 `cpu_alert.sh`

📜 Script: `disk_alert_mail.sh`

📜 Script: `disk_to_db_map.sh`

📜 `ultimate_disk_analyzer.sh`