Tuesday, May 26, 2026

Map Actual storage disk mount point to troubleshoot the storage related performance issue

🔗 1. Map `dm-*` → Actual Mount Points (VERY IMPORTANT)

✅ Command:

lsblk -o NAME,KNAME,MOUNTPOINT,SIZE,FSTYPE | grep dm-

✅ If using LVM:

dmsetup ls --tree

✅ Detailed mapping:

lsblk -o NAME,KNAME,PKNAME,MOUNTPOINT | column -t

✅ Correlate with filesystem:

df -h | grep /dev/mapper

👉 Why this matters

dm-62 → logical volume → mount point → DB datafile location
Helps answer:
“Which tablespace is causing this spike?”

🔗 2. Map Disk → Oracle / DB Files

✅ For Oracle:

SELECT file_name, tablespace_name
FROM dba_data_files
WHERE file_name LIKE '%<mount_point>%';

✅ Check temp / redo:

SELECT name FROM v$tempfile;

SELECT member FROM v$logfile;

👉 Now you can map:

dm-69 → /u02 → USERS tablespace → full scan

📊 3. iostat → DB Wait Event Mapping

iostat Pattern	DB Wait Event	Meaning
High rMB/s + large avgrq-sz	`db file scattered read`	Full table scan
High r/s small IO	`db file sequential read`	Index lookup
High w_await	`log file sync`	Commit latency
High avgqu-sz	`free buffer waits`	DB buffer pressure
High await	Any IO wait	Storage slow

🚨 4. Alert Thresholds (Production Standard)

✅ Disk Health Thresholds

Metric	Warning	Critical
%util	>80%	>90%
await	>20 ms	>50 ms
avgqu-sz	>5	>10
svctm vs await gap	noticeable	large gap

✅ Quick Alert Command

iostat -xm 2 5 | awk '
/Device/ {print; next}
$1 ~ /^(sd|dm)/ && ($NF > 90 || $10 > 50 || $9 > 10) {print}

👉 Triggers when:

%util > 90
await > 50 ms
queue > 10

📈 5. Real-Time Monitoring Script (Reusable)

✅ Save as `disk_monitor.sh`

#!/bin/bash
echo "==== Disk Bottleneck Check ===="
iostat -xm 2 3 | awk '
/Device/ {print; next}
$1 ~ /^(sd|dm)/ && ($NF > 90 || $10 > 50 || $9 > 10) {
printf "ALERT: %-8s util=%s%% await=%sms queue=%s\n", $1, $NF, $10, $9

✅ Run:

chmod +x disk_monitor.sh

./disk_monitor.sh

🔍 6. Identify Top IO Consumers

✅ Process level:

iotop -oP

✅ File level:

lsof | grep <device>

✅ Per process IO:

pidstat -d 2

🧠 7. Advanced DBA Analysis Flow

When you see:

%util = 100
await = high
avgqu-sz = high

✅ Follow this sequence:

Identify disk
Map to mount point
Map to DB file
Identify SQL causing IO
Check execution plan

⚡ 8. Quick Root Cause Patterns

🔴 Pattern 1 (Your case earlier)

avgrq-sz ~ 1024 KB
await ~ 80–100 ms

👉 Cause:

Full table scans
Data warehouse queries

🟢 Pattern 2

r/s high + avgrq-sz small + low await

👉 Cause:

OLTP workload (healthy)

🔴 Pattern 3

w_await high

👉 Cause:

Commit issues
Log sync bottleneck

🎯 9. What You Should Do Immediately (From Your Data)

Based on your earlier output:

🚨 Critical disks:

dm-69
dm-275

✅ Action Plan:

Map these disks → mount point
Identify DB objects
Run:

SELECT sql_id, executions, disk_reads
ORDER BY disk_reads DESC FETCH FIRST 10 ROWS ONLY;

✅ Final Takeaway

You now have:

✅ Disk → Mount → DB mapping
✅ Alert thresholds
✅ Real-time monitoring script
✅ DB wait correlation
✅ Troubleshooting workflow

iostat linux command deep drive to troubleshooting the performance issue

iostat -xm 2 5 | awk '$1 ~ /^(sd|dm)/ && $NF > 40 {printf "%-10s %s\n",$1,$NF"%"}'

iostat -xm 2 5 | awk '$NF > 40 {print}'

iostat -xm 2 5 | awk '/Device/ {print; next}$1 ~ /^(sd|dm)/ && $NF > 90 {print}'

📌 Header Breakdown (Deep Explanation)

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util

✅ 1. Device

Logical or physical disk name
- sdX → physical disks
- dm-X → device mapper (LVM, ASM, multipath)

👉 In your case:

dm-* = logical volumes / DB storage layers

✅ 2. rrqm/s (Read Requests Merged per second)

Number of read requests merged by OS scheduler

Why merging matters:

OS combines adjacent reads to reduce I/O calls

👉 Example:

10 small reads → merged → 1 large read

✅ Interpretation:

High value → efficient sequential I/O
Zero → either random I/O or already optimized

✅ 3. wrqm/s (Write Requests Merged per second)

Same as above but for writes

✅ High value:

Good for sequential writes (e.g., redo logs, batch loads)

✅ 4. r/s (Reads per second)

Number of read I/O operations per second

Interpretation:

High r/s = high IOPS (random access likely)

✅ 5. w/s (Writes per second)

Number of write operations per second

👉 Together with r/s:

Indicates workload type:
- OLTP → high r/s + w/s, small IO
- Analytics → lower r/s but large I/O size

✅ 6. rMB/s (Read throughput in MB/sec)

Total data read per second

✅ 7. wMB/s (Write throughput in MB/sec)

Total data written per second

🔎 Important:

Pattern	Meaning
High r/s + low rMB/s	small random IO
Low r/s + high rMB/s	large sequential IO

✅ 8. avgrq-sz (Average Request Size)

Average size of each I/O request (in KB)

Formula:

avgrq-sz = (total sectors read+written) / total I/O ops

Interpretation:

Value	Meaning
< 32 KB	random IO (OLTP)
64–256 KB	mixed
~1024 KB (1MB)	sequential scan

✅ 9. avgqu-sz (Average Queue Length)

Number of I/O requests waiting in queue

🚨 Critical metric:

Value	Impact
< 1	healthy
1–5	moderate
10+	pressure
20+	severe bottleneck

👉 High value means:

Disk is overloaded
Requests are waiting → latency increase

✅ 10. await (Average Wait Time in ms)

Total time for I/O request:
```
wait time = queue time + service time
```

🚨 Thresholds:

Value	Meaning
< 5 ms	excellent
5–20 ms	acceptable
20–50 ms	warning
> 50 ms	serious issue

👉 This is the most important latency metric

✅ 11. r_await (Read latency)

Avg time for read requests

✅ 12. w_await (Write latency)

Avg time for write requests

Why split matters:

Helps identify:
- read-heavy issues (full scan)
- write bottlenecks (redo/log/file sync)

✅ 13. svctm (Service Time)

Time taken by disk to service request
Does NOT include queue time

Important:

await ≈ svctm + queue delay

Interpretation:

Case	Meaning
await ≈ svctm	no queue bottleneck
await >> svctm	queue contention

👉 This is key for bottleneck detection

✅ 14. %util (Utilization)

Percentage of time disk was busy

🚨 Interpretation:

Value	Meaning
< 60%	safe
60–80%	moderate
80–90%	high
> 90%	saturated

👉 BUT:

Must combine with await + queue

🔥 Important Combined Interpretation

✅ Case 1 (Healthy high usage)

%util = 95%
await = 1 ms
avgqu-sz = 1

✔ Efficient disk

🚨 Case 2 (Bottleneck)

%util = 99%
await = 80 ms
avgqu-sz = 20

❌ Disk saturation + queue buildup

🧠 How You Should Read Header (DBA Cheat Sheet)

Step-by-step analysis:

Check %util
- 90 → possible saturation
Check avgqu-sz
- High → queue backlog
Check await
- Confirms latency impact
Compare await vs svctm
- Big gap → queue delay
Check avgrq-sz
- Understand workload type

🎯 Why This Matters for You (Database Architect)

This header directly helps identify:

✅ DB Issues Mapping

Metric	DB Problem
High rMB/s + large avgrq-sz	full table scan
High r/s, low size	index lookup
High w_await	commit / redo issues
High avgqu-sz	storage contention
High await	slow queries

✅ Final Summary

r/s, w/s → IOPS
rMB/s, wMB/s → throughput
avgrq-sz → IO size (random vs sequential)
avgqu-sz → pressure indicator 🚨
await → real latency 🚨
%util → saturation signal

Monday, May 25, 2026

Step-by-Step HugePages Configuration (Oracle 19c on Linux)

🔹 Step 1: Check Current HugePages Status

grep Huge /proc/meminfo

Key parameters:

HugePages_Total
HugePages_Free
Hugepagesize (usually 2 MB)

🔹 Step 2: Disable Transparent HugePages (THP)

Oracle recommends disabling THP.

Check status:

cat /sys/kernel/mm/transparent_hugepage/enabled

Disable temporarily:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

echo never > /sys/kernel/mm/transparent_hugepage/defrag

Disable permanently:

Edit:

vi /etc/default/grub

Add:

transparent_hugepage=never

Then apply:

grub2-mkconfig -o /boot/grub2/grub.cfg

reboot

🔹 Step 3: Calculate Required HugePages

3.1 Check Oracle SGA size

From SQL:

show parameter sga_target;

or:

show parameter memory_target;

3.2 Use Oracle Script (Recommended)

$ORACLE_HOME/bin/hugepages_settings.sh

If not available, download from Oracle MOS.

3.3 Manual Calculation

Formula:

HugePages = Total_SGA_Size / HugePage_Size

Example:

SGA = 64 GB
Page size = 2 MB

64 * 1024 MB / 2 MB = 32768 HugePages

🔹 Step 4: Configure HugePages in Kernel

Edit:

vi /etc/sysctl.conf

Add/update:

vm.nr_hugepages=32768

Apply:

sysctl -p

🔹 Step 5: Configure memlock for Oracle User

Edit:

vi /etc/security/limits.conf

Add:

oracle soft memlock 67108864

oracle hard memlock 67108864

👉 Value should match SGA size (in KB)

Example:

64 GB = 67108864 KB

🔹 Step 6: Disable AMM (Very Important)

HugePages does NOT work with AMM.

Check:

show parameter memory_target;

If > 0, disable:

ALTER SYSTEM SET memory_target=0 SCOPE=SPFILE;

ALTER SYSTEM SET memory_max_target=0 SCOPE=SPFILE;

Instead, use ASMM:

ALTER SYSTEM SET sga_target=64G SCOPE=SPFILE;

ALTER SYSTEM SET pga_aggregate_target=8G SCOPE=SPFILE;

Restart DB:

shutdown immediate;

startup;

🔹 Step 7: Restart Server

reboot

🔹 Step 8: Validate HugePages Usage

grep Huge /proc/meminfo

Check:

HugePages_Total → should match configured value
HugePages_Free → should decrease after DB start
HugePages_Rsvd → reserved pages

🔹 Step 9: Verify Oracle is Using HugePages

grep -i huge /proc/meminfo

Also check alert log:

grep HugePages $ORACLE_BASE/diag/rdbms/*/*/trace/alert*.log

Expected message:

Huge Pages allocation successful

✅ Best Practices

✔ Always leave some RAM for OS (not 100% HugePages)
✔ Use static SGA (ASMM)
✔ Set HugePages slightly higher than requirement
✔ Monitor with:

vmstat

free -g

⚠️ Common Mistakes

❌ Using AMM (memory_target)
❌ Not setting memlock
❌ THP not disabled
❌ Underestimating number of HugePages

🚀 Quick Example Summary

For a 64 GB SGA:

Setting	Value
HugePage size	2 MB
Required pages	32768
memlock	67108864 KB
vm.nr_hugepages	32768

Oracle Script for huge_page setting

vi hugepages_settings.sh

#!/bin/bash

# hugepages_settings.sh

# Linux bash script to compute values for the

# recommended HugePages/HugeTLB configuration

# on Oracle Linux

# Note: This script does calculation for all shared memory

# segments available when the script is run, no matter it

# is an Oracle RDBMS shared memory segment or not.

# This script is provided by KB151310 from My Oracle Support

# http://support.oracle.com

# Welcome text

echo "

This script is provided by KB151310 from My Oracle Support

(http://support.oracle.com) where it is intended to compute values for

the recommended HugePages/HugeTLB configuration for the current shared

memory segments on Oracle Linux. Before proceeding with the execution please note following:

* For ASM instance, it needs to configure ASMM instead of AMM.

* The 'pga_aggregate_target' is outside the SGA and

you should accommodate this while calculating the overall size.

* In case you changes the DB SGA size,

as the new SGA will not fit in the previous HugePages configuration,

it had better disable the whole HugePages,

start the DB with new SGA size and run the script again.

And make sure that:

* Oracle Database instance(s) are up and running

* Oracle Database Automatic Memory Management (AMM) is not setup

(See KB83222)

* The shared memory segments can be listed by command:

# ipcs -m

Press Enter to proceed..."

read

# Check for the kernel version

KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size

HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`

if [ -z "$HPG_SZ" ];then

echo "The hugepages may not be supported in the system where the script is being executed."

exit 1

# Initialize the counter

NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments

for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`

MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`

if [ $MIN_PG -gt 0 ]; then

NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`

done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense

# Bail out if that is the case

if [ $RES_BYTES -lt 100000000 ]; then

echo "***********"

echo "** ERROR **"

echo "***********"

echo "Sorry! There are not enough total of shared memory segments allocated for

HugePages configuration. HugePages can only be used for shared memory segments

that you can list by command:

# ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:

* Oracle Database instance is up and running

* Oracle Database Automatic Memory Management (AMM) is not configured"

exit 1

# Finish with results

echo "Recommended setting: vm.nr_hugepages = $NUM_PG";

# End

Monday, May 18, 2026

step‑by‑step guide to check CPU sockets, cores, threads, and multithreading (Hyper‑Threading/SMT)

Step‑by‑step guide to check CPU sockets, cores, threads, and multithreading (Hyper‑Threading/SMT)

✅ 1. BASIC UNDERSTANDING (IMPORTANT)

Term	Meaning
Socket	Physical CPU installed on motherboard
Core	Physical processing unit inside CPU
Thread	Logical CPU (via Hyper‑Threading / SMT)
Multithreading Enabled?	If Threads > Cores

👉 Example:

1 socket, 4 cores, 8 threads → Hyper‑Threading ENABLED
1 socket, 4 cores, 4 threads → Hyper‑Threading DISABLED

🐧 2. LINUX – STEP BY STEP

🔹 Step 1: Check CPU details

Run:

lscpu

Example output:

CPU(s):              8
Socket(s):           1
Core(s) per socket:  4
Thread(s) per core:  2

Interpret:

Sockets = 1
Cores = 4
Threads = 8
Thread per core = 2 → Multithreading ENABLED ✅

🔹 Step 2: Check using /proc

cat /proc/cpuinfo | grep "physical id" | sort -u | wc -l

👉 Gives number of sockets

cat /proc/cpuinfo | grep "cpu cores" | uniq

👉 Shows cores per socket

cat /proc/cpuinfo | grep "processor" | wc -l

👉 Total logical CPUs (threads)

🔹 Step 3: Quick single command summary

nproc

👉 gives total threads

🔹 Step 4: Check Hyper‑Threading explicitly

lscpu | grep "Thread"

If:

Thread(s) per core: 2

👉 ✅ Hyper‑Threading ON

If:

Thread(s) per core: 1

👉 ❌ OFF

🪟 3. WINDOWS – STEP BY STEP

🔹 Step 1: Task Manager (GUI method)

Press Ctrl + Shift + Esc
Go to Performance tab → CPU
Look at:

Sockets: X
Cores: Y
Logical processors: Z

👉 Example:

Cores = 4
Logical processors = 8
✅ Multithreading ENABLED

🔹 Step 2: Command Prompt

Run:

wmic cpu get NumberOfCores,NumberOfLogicalProcessors

Output:

NumberOfCores  NumberOfLogicalProcessors
4              8

👉 Threads (LogicalProcessors) > Cores
✅ Hyper‑Threading ON

🔹 Step 3: PowerShell (recommended)

Get-WmiObject Win32_Processor | Select NumberOfCores, NumberOfLogicalProcessors, SocketDesignation

OR modern cmdlet:

Get-CimInstance Win32_Processor | Select NumberOfCores,NumberOfLogicalProcessors

🔹 Step 4: System Information

Run:

msinfo32

Check:

Processor
Logical processors

🔍 4. HOW TO TELL IF MULTITHREADING IS ENABLED

✅ Rule:

If Threads > Cores → ENABLED
If Threads = Cores → DISABLED

Example:

Cores	Threads	Status
4	8	✅ Enabled
8	16	✅ Enabled
4	4	❌ Disabled

🔧 5. BIOS LEVEL CHECK (Important)

Even if OS shows cores/threads, final control is in BIOS.

Steps:

Reboot system
Enter BIOS (F2 / DEL / ESC)
Look for:
- Intel Hyper‑Threading
- AMD SMT (Simultaneous Multithreading)

Settings:

Enabled ✅ → uses threads
Disabled ❌ → only physical cores

✅ 6. QUICK CHEAT SHEET

Linux

lscpu

Windows

wmic cpu get NumberOfCores,NumberOfLogicalProcessors

🚀 FINAL SUMMARY

Platform	Command	What to Check
Linux	`lscpu`	Threads per core
Linux	`/proc/cpuinfo`	cores, sockets
Windows	Task Manager	Logical processors
Windows	`wmic`	cores vs threads

SQL> SELECT stat_name, value

FROM v$osstat

WHERE stat_name IN ('NUM_CPUS','NUM_CPU_CORES','NUM_CPU_SOCKETS');

`` 2 3

STAT_NAME VALUE

------------------------------ ----------

NUM_CPUS 8

NUM_CPU_CORES 8

NUM_CPU_SOCKETS 2

SQL>

Interpretation:

NUM_CPUS = logical CPUs visible to the OS (includes HT threads)

NUM_CPU_CORES = physical cores

If NUM_CPUS > NUM_CPU_CORES → SMT/HT enabled

If NUM_CPUS = NUM_CPU_CORES → SMT/HT disabled

SQL> !cat /proc/cpuinfo | egrep "processor|cpu cores|siblings|physical id"

physical id : 0

siblings : 4

cpu cores : 4

processor : 1

cpu cores = physical cores per socket

siblings = logical CPUs per socket

If siblings > cpu cores → SMT/HT enabled

If siblings == cpu cores → SMT/HT disabled

Tuesday, May 26, 2026

Map Actual storage disk mount point to troubleshoot the storage related performance issue

🔗 1. Map dm-* → Actual Mount Points (VERY IMPORTANT)

✅ Command:

✅ If using LVM:

✅ Detailed mapping:

✅ Correlate with filesystem:

🔗 2. Map Disk → Oracle / DB Files

✅ For Oracle:

✅ Check temp / redo:

📊 3. iostat → DB Wait Event Mapping

🚨 4. Alert Thresholds (Production Standard)

✅ Disk Health Thresholds

✅ Quick Alert Command

📈 5. Real-Time Monitoring Script (Reusable)

✅ Save as disk_monitor.sh

✅ Run:

🔍 6. Identify Top IO Consumers

✅ Process level:

✅ File level:

✅ Per process IO:

🧠 7. Advanced DBA Analysis Flow

✅ Follow this sequence:

⚡ 8. Quick Root Cause Patterns

🔴 Pattern 1 (Your case earlier)

🟢 Pattern 2

🔴 Pattern 3

🎯 9. What You Should Do Immediately (From Your Data)

🚨 Critical disks:

✅ Action Plan:

✅ Final Takeaway

iostat linux command deep drive to troubleshooting the performance issue

📌 Header Breakdown (Deep Explanation)

✅ 1. Device

✅ 2. rrqm/s (Read Requests Merged per second)

Why merging matters:

✅ 3. wrqm/s (Write Requests Merged per second)

✅ 4. r/s (Reads per second)

Interpretation:

✅ 5. w/s (Writes per second)

✅ 6. rMB/s (Read throughput in MB/sec)

✅ 7. wMB/s (Write throughput in MB/sec)

🔎 Important:

✅ 8. avgrq-sz (Average Request Size)

Formula:

Interpretation:

✅ 9. avgqu-sz (Average Queue Length)

🚨 Critical metric:

✅ 10. await (Average Wait Time in ms)

🚨 Thresholds:

✅ 11. r_await (Read latency)

✅ 12. w_await (Write latency)

Why split matters:

✅ 13. svctm (Service Time)

Important:

Interpretation:

✅ 14. %util (Utilization)

🚨 Interpretation:

🔥 Important Combined Interpretation

✅ Case 1 (Healthy high usage)

🚨 Case 2 (Bottleneck)

🧠 How You Should Read Header (DBA Cheat Sheet)

Step-by-step analysis:

🎯 Why This Matters for You (Database Architect)

✅ DB Issues Mapping

✅ Final Summary

Monday, May 25, 2026

Step-by-Step HugePages Configuration (Oracle 19c on Linux)

Step-by-Step HugePages Configuration (Oracle 19c on Linux)

🔹 Step 1: Check Current HugePages Status

🔹 Step 2: Disable Transparent HugePages (THP)

Check status:

Disable temporarily:

Disable permanently:

🔹 Step 3: Calculate Required HugePages

3.1 Check Oracle SGA size

3.2 Use Oracle Script (Recommended)

3.3 Manual Calculation

🔹 Step 4: Configure HugePages in Kernel

🔹 Step 5: Configure memlock for Oracle User

🔗 1. Map `dm-*` → Actual Mount Points (VERY IMPORTANT)

✅ Save as `disk_monitor.sh`