Oracle Database CPU Performance Troubleshooting (RHEL)
1. Typical CPU Performance Symptoms
- High CPU utilization (near 100%)
- Load average increases rapidly
- Slow query response while disk latency is normal
- AWR shows:
DB CPUCPU timeresmgr:cpu quantum
- OS
%idlevery low and%walow (CPU, not IO)
2. Step‑1: Confirm CPU Pressure at OS Level
✅ Check load vs CPU cores
Interpretation
- Load average ≤ CPU cores → system OK
- Load average >> CPU cores → CPU contention
Example:
load average: 48.21, 45.12, 40.90
CPU cores: 24
➡️ CPU saturation confirmed
3. Step‑2: Identify CPU Utilization & Wait
✅ top (quick overview)Focus on top line:
%Cpu(s): 92.4 us, 3.1 sy, 0.2 ni, 2.0 id, 1.5 wa
CPU‑bound system
%us + %syhigh%idnear 0%walow
✅ vmstat – verify run queue
Key columns:
| Column | Meaning |
|---|---|
r | Runnable processes |
us | User CPU |
sy | System CPU |
id | Idle |
wa | IO wait |
If r > CPU cores consistently → CPU contention
4. Step‑3: Identify Top CPU Consumers
✅ Per‑process CPU usage
Look for:
- Oracle foreground sessions
- Oracle background processes (
ora_*) - Non‑DB processes stealing CPU
✅ Real‑time per‑process view
This shows Oracle threads consuming CPU.
5. Step‑4: Per‑CPU & Core Imbalance
✅ mpstat (very important on NUMA)
Look for:
- Some CPUs at 100%
- Others mostly idle
➡️ Indicates:
- NUMA imbalance
- CPU affinity or pinning problem
6. Step‑5: Context Switching & Syscall Overhead
✅ pidstat – CPU & context switches
High:
cswch/s(context switches)nvcswch/s
➡️ Too many active sessions or excessive parsing
7. Step‑6: Oracle Side – Is Database the CPU Consumer?
✅ CPU usage inside Oracle
Check ratio:
DB CPUclose to total DB time → CPU bottleneck- IO waits dominant → not CPU issue
8. Step‑7: Identify Oracle Sessions Using CPU
✅ Active CPU sessions
🔁 Correlate SPID with OS:
top -p <spid>
9. Step‑8: SQL Responsible for CPU Usage
✅ Top CPU SQL
Red flags:
- High
avg_cpu - Low executions but high total CPU
- Cartesian joins
- Missing indexes
10. Step‑9: Run Queue vs CPU Throttling
✅ Check cgroups / VM CPU limits
Quota < total cores → CPU throttling
11. Step‑10: Oracle Resource Manager (Very Common)
✅ Check RM activation
✅ CPU waits due to RM
If high:
- Sessions are being CPU throttled intentionally
12. Common CPU Root Causes
| Cause | Symptom |
|---|---|
| Bad execution plan | Single SQL consumes most CPU |
| Missing index | Full scans |
| High parse rate | Hard parsing |
| Too many sessions | Context switching |
| CPU oversubscription | VM / container CPU cap |
| NUMA imbalance | Some CPUs overloaded |
13. Immediate Mitigation Actions
✅ Short‑term:
- Kill runaway sessions
- Reduce parallelism
- Disable unnecessary jobs
- Temporarily increase CPU shares
✅ Medium‑term:
- SQL tuning (indexes, hints)
- Bind variables
- Increase cursor cache
- Enable Result Cache (where valid)
✅ Long‑term:
- SQL baselines
- CPU capacity increase
- NUMA pinning
- Resource Manager redesign
14. One‑Shot CPU Triage Command Set
Oracle:
15. CPU vs IO – Golden Rule
| Observation | Root Cause |
|---|---|
High %us, low %wa | CPU bottleneck |
| High load, CPUs idle | IO bottleneck |
High DB CPU in AWR | SQL tuning required |
| RM waits | Resource controls |
No comments:
Post a Comment