Monday, April 27, 2026

Troubleshoot storage I/O performance issues -- Linux, Oracle


End-to-end explanation of the command:

    ps -eo pid,stat,comm | grep D


This is a process inspection command used heavily by Linux, Unix, and database administrators for system and performance troubleshooting.


1️⃣ What is ps?

ps stands for Process Status.
It reports information about currently running processes on a Linux system.

Think of it as a snapshot of processes at the moment you run the command.

📌 Unlike top or htop, ps:

  • Is not interactive
  • Shows a point‑in‑time view
  • Is ideal for scripting and diagnostics

2️⃣ Command Breakdown

ps -eo pid,stat,comm

Let’s split it into parts:


🔹 ps

Invokes the process status utility.


🔹 -e option (select processes)

-e

Means:
Show all processes running on the system

Without -e, ps shows only the current user's processes attached to the current terminal (TTY).

Equivalent options:

ps -e
ps -A

Both mean “every process”.
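
To see the effect of -e, one can compare how many processes each form selects. This is a minimal sketch, assuming a procps-style ps; the variable names are illustrative only.

```shell
# Count the processes ps selects by default (same user, same terminal)
# versus with -e (every process). "-o pid=" prints only PIDs, no header.
default_count=$(ps -o pid= | wc -l)
all_count=$(ps -e -o pid= | wc -l)
echo "default selection: $default_count processes"
echo "with -e:           $all_count processes"
```

On a typical server the second number is much larger, because most daemons and database background processes have no controlling terminal.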


🔹 -o option (custom output format)

-o pid,stat,comm

Means:
Choose which columns to display

Instead of default columns, you explicitly request:

Field | Meaning
pid | Process ID
stat | Process state
comm | Command name (executable)

This is extremely useful for focused troubleshooting.
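
Custom columns also make the output easy to consume from a script. The sketch below assumes a procps-style ps, where an empty header after "=" suppresses the header line. Each column gets its own -o so that the "=" is never ambiguous. The variable names are illustrative.

```shell
# Print the three requested columns without a header line, keep the first
# five rows, and read each row into shell variables.
rows=$(ps -e -o pid= -o stat= -o comm= | head -n 5 |
while read -r pid stat comm; do
    printf 'pid=%s state=%s name=%s\n' "$pid" "$stat" "$comm"
done)
echo "$rows"
```

Because comm is the last field read, a process name that contains spaces still ends up whole in the comm variable.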


3️⃣ Output Columns (Explained in Depth)

🔸 PID — Process ID

Example:

24567
  • Unique identifier for a process
  • Assigned by the Linux kernel
  • Required to manage or inspect processes

Used in commands like:

kill 24567
strace -p 24567
cat /proc/24567/status

📌 Notes:

  • PID 1 is the init process (systemd on most modern distributions); inside a container, PID 1 is the container's own first process
  • PIDs are reused after processes exit
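
As a quick illustration of using a PID, the sketch below inspects the current shell, whose PID is available as $$. It assumes a Linux system, since /proc/<pid>/status is Linux-specific.

```shell
# Look up this shell's own PID in ps and in /proc.
pid=$$
ps -o pid,stat,comm -p "$pid"

# The State: line in /proc/<pid>/status carries the same state letter
# that ps reports in the STAT column.
state_line=$(grep '^State:' "/proc/$pid/status")
echo "$state_line"
```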

🔸 STAT — Process State (most important field)

The STAT column shows:

  1. Main execution state
  2. Additional flags

Primary states

Code | Meaning
R | Running or runnable (on CPU or ready)
S | Sleeping (waiting for event)
D | Uninterruptible sleep (I/O wait)
T | Stopped (signal or debugger)
Z | Zombie (dead, not cleaned up)
I | Idle kernel thread (newer kernels)

👉 The first letter is the core state.


Modifier flags (can appear after the main letter)

Flag | Meaning
s | Session leader
l | Multithreaded (uses threads)
+ | Foreground process
< | High priority
N | Low priority

STAT examples explained

Ss
  • S → sleeping
  • s → session leader
    ✅ Normal background service
Ssl+
  • Sleeping
  • Session leader
  • Multithreaded
  • Foreground task
    ✅ Common for DB or Java processes
D

🚨 Critical

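  • Process is blocked inside the kernel, usually waiting for I/O to complete
  • Cannot be killed while in this state (signals, even kill -9, take effect only after the wait ends)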
  • Usually due to:
    • Disk I/O
    • NFS
    • SAN / ASM
    • Kernel storage issue
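
The state and flag codes above can be decoded mechanically. The following is a small sketch of a hypothetical helper, describe_stat, which is not part of ps. It covers only the codes listed in this post and reports anything else as an unknown flag.

```shell
# describe_stat: translate a STAT value such as "Ssl+" into words.
describe_stat() {
    stat=$1
    rest=${stat#?}             # everything after the first character
    main=${stat%"$rest"}       # the first character (main state)
    case $main in
        R) out="running" ;;
        S) out="sleeping" ;;
        D) out="uninterruptible sleep" ;;
        T) out="stopped" ;;
        Z) out="zombie" ;;
        I) out="idle kernel thread" ;;
        *) out="unknown state $main" ;;
    esac
    # Walk the remaining characters one at a time as modifier flags.
    while [ -n "$rest" ]; do
        tail=${rest#?}
        flag=${rest%"$tail"}
        case $flag in
            s) out="$out, session leader" ;;
            l) out="$out, multithreaded" ;;
            +) out="$out, foreground" ;;
            '<') out="$out, high priority" ;;
            N) out="$out, low priority" ;;
            *) out="$out, flag $flag" ;;
        esac
        rest=$tail
    done
    printf '%s\n' "$out"
}

describe_stat Ss      # sleeping, session leader
describe_stat Ssl+    # sleeping, session leader, multithreaded, foreground
describe_stat D       # uninterruptible sleep
```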

Examples

Ss

→ Sleeping, session leader

D

→ Blocked on I/O (disk, NFS, storage). Very important state

Ssl+

→ Sleeping, session leader, multithreaded, foreground job

📌 Critical note
If a process is in D state, it:

  • Cannot be killed while it stays in that state (kill -9 takes effect only once the wait ends)
  • Is usually waiting on disk, SAN, ASM, or NFS
  • Indicates storage or kernel-level issues
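
When a process is stuck in D state, the next question is usually what it is waiting on. One possible approach is to add the wchan column, which names the kernel function the process is sleeping in. This sketch assumes a procps-style ps. On some kernels wchan shows only "-", or needs elevated privileges to be meaningful.

```shell
# List processes in D state together with their wait channel (wchan).
# The header row (NR == 1) is kept so the columns stay labelled.
blocked=$(ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /^D/')
echo "$blocked"
```

On a healthy system this usually prints just the header. Where the kernel exposes it, root can also read /proc/<pid>/stack for the full kernel stack of a blocked process.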


🔸 COMMAND — Executable Name

Example:

oracle
sshd
ora_w00l
  • Shows only the process name (Linux truncates it to 15 characters)
  • Does NOT include command‑line arguments

For full command line:

ps -eo pid,stat,cmd
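
The difference between the two columns is easy to see on the current shell. This is a minimal sketch: args is the standard name for the column that procps also accepts as cmd, and $$ is the shell's own PID.

```shell
# Compare the short process name (comm) with the full command line (args).
name=$(ps -o comm= -p $$)
full=$(ps -o args= -p $$)
echo "comm: $name"
echo "args: $full"
```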

📌 Oracle example:

ora_w00l

Means:

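  • ora_ → Oracle background process
  • w00l → worker process (in Oracle's naming, Wnnn processes are space management workers, while parallel query servers are named Pnnn)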

4️⃣ Sample Output and Interpretation

PID STAT COMMAND
1 Ss systemd
1023 Ssl oracle
2045 D ora_dbw0

How to read this:

  • systemd → sleeping session leader (normal)
  • oracle → sleeping, multithreaded (normal)
  • ora_dbw0 → D state (problem)
    → Indicates disk or ASM issue

5️⃣ Why this command is widely used

✅ Lightweight and fast

  • No interactive overhead
  • Safe on production systems

✅ Perfect for troubleshooting

  • Detects:
    • Hung processes
    • Storage stalls
    • Zombie accumulation
    • Oracle background issues
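
A quick way to spot any of these conditions is to tally processes by their main state. The sketch below uses a hypothetical helper, tally_states, that reads STAT values on standard input.

```shell
# tally_states: count processes per main state (first letter of STAT).
tally_states() {
    awk '{ print substr($1, 1, 1) }' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Live snapshot: "stat=" prints the STAT column without a header.
ps -e -o stat= | tally_states
```

A growing Z count points at zombie accumulation. A D count that stays above zero across several runs points at a storage stall.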

✅ Script‑friendly

Used inside:

  • Shell scripts
  • Health checks
  • Cron jobs

6️⃣ Common Enhancements

Show only blocked (D) processes

ps -eo pid,stat,comm | awk '$2 ~ /D/'
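
Why filter with awk rather than grep D? The sketch below runs both against a small canned sample. The DataCollector row is a made-up process name, added only to show a false match. Anchoring the pattern with ^ restricts the test to the main state letter.

```shell
sample='PID STAT COMMAND
1 Ss systemd
1023 Ssl oracle
2045 D ora_dbw0
3100 S DataCollector'

# grep D matches the letter anywhere on the line, including the header
# and the command name "DataCollector".
grep_hits=$(printf '%s\n' "$sample" | grep -c D)

# awk tests only the second field (STAT), and only its first character.
awk_hits=$(printf '%s\n' "$sample" | awk '$2 ~ /^D/ { n++ } END { print n + 0 }')

echo "grep D matched $grep_hits lines, awk matched $awk_hits"
```

Here grep reports three lines while awk reports only the one process that is really in D state.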


Sort by process state

ps -eo pid,stat,comm --sort=stat

Add user and CPU usage

ps -eo user,pid,stat,%cpu,%mem,comm


7️⃣ Practical Use Case (Oracle / DB servers)

DBAs frequently use:

ps -eo pid,stat,comm | grep ora_

To detect:

  • Stuck background workers
  • DBWR/LGWR waiting on disk
  • Parallel query stalls

If many ora_* processes stay in D across repeated checks: 🚨 Storage team must be involved immediately
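
Because a process can pass through D state briefly during normal I/O, it helps to sample more than once before escalating. The following sketch uses a hypothetical helper, sample_blocked_oracle. The sample count and interval are arbitrary defaults.

```shell
# sample_blocked_oracle SAMPLES INTERVAL
# Print ora_* processes found in D state, SAMPLES times, INTERVAL seconds apart.
sample_blocked_oracle() {
    samples=${1:-3}
    interval=${2:-1}
    i=1
    while [ "$i" -le "$samples" ]; do
        printf 'sample %s at %s\n' "$i" "$(date '+%H:%M:%S')"
        ps -e -o pid= -o stat= -o comm= | awk '$2 ~ /^D/ && $3 ~ /^ora_/'
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

sample_blocked_oracle 3 1
```

The same PIDs appearing in every sample are the ones worth reporting.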


✅ Final Summary

Component | Purpose
ps | Show process snapshot
-e | Include all processes
-o | Customize output
pid | Process identifier
stat | Execution + wait state
comm | Executable name

🎯 Key troubleshooting signal

  • R, S → Normal
  • D → I/O or kernel problem
  • Z → Parent process issue


  • PID → Unique process identifier
  • STAT → Current state + extra flags (critical for troubleshooting)
  • COMMAND → Executable name

🎯 For troubleshooting:

  • R / S → Normal
  • D → Investigate immediately
  • Z → Parent process issue
