
Monday, May 25, 2026

Step-by-Step HugePages Configuration (Oracle 19c on Linux)

🔹 Step 1: Check Current HugePages Status

grep Huge /proc/meminfo

Key parameters:

HugePages_Total
HugePages_Free
Hugepagesize (usually 2 MB)

🔹 Step 2: Disable Transparent HugePages (THP)

Oracle recommends disabling THP on database servers because it can cause memory allocation delays and, on RAC, node instability.

Check status:

cat /sys/kernel/mm/transparent_hugepage/enabled

Disable temporarily:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

echo never > /sys/kernel/mm/transparent_hugepage/defrag

Disable permanently:

Edit:

vi /etc/default/grub

Append to the GRUB_CMDLINE_LINUX line:

transparent_hugepage=never

Then apply:

grub2-mkconfig -o /boot/grub2/grub.cfg

reboot
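After the reboot, the active THP mode can be confirmed by reading the bracketed entry in the control file. The helper below is a minimal sketch; the function name thp_mode is illustrative and not part of any Oracle or kernel tooling.

```shell
#!/bin/bash
# thp_mode: print the active mode from the text of a THP control file.
# The kernel marks the active choice with square brackets,
# for example "always madvise [never]".
thp_mode() {
    # $1 = file contents, e.g. "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"
    echo "$1" | sed -n 's/.*\[\([a-z+_]*\)\].*/\1/p'
}

# Example with a literal string (no /sys access needed):
thp_mode "always madvise [never]"
```

On a live server the same function can be fed the real file contents for both the enabled and defrag files; both should report never.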

🔹 Step 3: Calculate Required HugePages

3.1 Check Oracle SGA size

From SQL:

show parameter sga_target;

or:

show parameter memory_target;

3.2 Use Oracle Script (Recommended)

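./hugepages_settings.sh

The script is normally not shipped under $ORACLE_HOME; download it from My Oracle Support (a copy is reproduced at the end of this post) and run it while the instance is up.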

3.3 Manual Calculation

Formula:

HugePages = Total_SGA_Size / HugePage_Size

Example:

SGA = 64 GB
Page size = 2 MB

64 * 1024 MB / 2 MB = 32768 HugePages
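The manual formula can be sketched as a small shell function that rounds up, so a partial page still gets a full HugePage. The function name hugepages_needed and its arguments are assumptions made for illustration. The Oracle script at the end of this post also adds one extra page per shared memory segment, which this sketch does not do.

```shell
#!/bin/bash
# hugepages_needed SGA_MB PAGE_KB
# Prints the number of HugePages needed to hold an SGA of SGA_MB megabytes
# when each HugePage is PAGE_KB kilobytes (Hugepagesize in /proc/meminfo).
hugepages_needed() {
    local sga_mb=$1 page_kb=$2
    local sga_kb=$(( sga_mb * 1024 ))
    # Integer ceiling division: round up to a whole page.
    echo $(( (sga_kb + page_kb - 1) / page_kb ))
}

# 64 GB SGA with 2 MB pages
hugepages_needed 65536 2048   # prints 32768
```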

🔹 Step 4: Configure HugePages in Kernel

Edit:

vi /etc/sysctl.conf

Add/update:

vm.nr_hugepages=32768

Apply:

sysctl -p

🔹 Step 5: Configure memlock for Oracle User

Edit:

vi /etc/security/limits.conf

Add:

oracle soft memlock 67108864

oracle hard memlock 67108864

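👉 Value is in KB and must be at least the size of the SGA (the HugePages allocation); with a lower limit the instance cannot lock the SGA into HugePages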

Example:

64 GB = 67108864 KB
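The GB-to-KB conversion and the two limits.conf lines can be generated rather than typed by hand. This is a sketch under the assumption that the OS user is named oracle and that the limit is set equal to the SGA size; memlock_kb and memlock_lines are illustrative names.

```shell
#!/bin/bash
# memlock_kb GB: convert a size in GB to KB (the unit used by limits.conf).
memlock_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# memlock_lines USER GB: print the soft and hard memlock entries.
memlock_lines() {
    local user=$1 kb
    kb=$(memlock_kb "$2")
    printf '%s soft memlock %s\n%s hard memlock %s\n' "$user" "$kb" "$user" "$kb"
}

# 64 GB SGA for the oracle user
memlock_lines oracle 64
```

After the oracle user logs in again, ulimit -l should report the same value.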

🔹 Step 6: Disable AMM (Very Important)

HugePages do NOT work with AMM: AMM allocates the SGA through /dev/shm, which cannot be backed by HugePages.

Check:

show parameter memory_target;

If > 0, disable:

ALTER SYSTEM SET memory_target=0 SCOPE=SPFILE;

ALTER SYSTEM SET memory_max_target=0 SCOPE=SPFILE;

Instead, use ASMM:

ALTER SYSTEM SET sga_target=64G SCOPE=SPFILE;

ALTER SYSTEM SET pga_aggregate_target=8G SCOPE=SPFILE;

Restart DB:

shutdown immediate;

startup;

🔹 Step 7: Restart Server

reboot

🔹 Step 8: Validate HugePages Usage

grep Huge /proc/meminfo

Check:

HugePages_Total → should match configured value
HugePages_Free → should decrease after DB start
HugePages_Rsvd → reserved pages
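The three checks above can be combined into one pass over the meminfo output. The function below is a sketch; hugepages_report and its exit codes are assumptions chosen for this example, and it takes the file path as an argument so it can also be run against a saved copy.

```shell
#!/bin/bash
# hugepages_report FILE EXPECTED_TOTAL
# Reads a meminfo-style file and compares HugePages_Total with the
# configured vm.nr_hugepages value. Pages in use = Total - Free.
# Exit status: 0 = OK, 1 = total mismatch, 2 = no pages in use.
hugepages_report() {
    local file=${1:-/proc/meminfo} expected=$2
    awk -v expected="$expected" '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^HugePages_Rsvd:/  { rsvd  = $2 }
        END {
            printf "total=%d free=%d rsvd=%d in_use=%d\n", total, free, rsvd, total - free
            if (total != expected) { print "FAIL: HugePages_Total differs from configured value"; exit 1 }
            if (free == total)     { print "WARN: no HugePages in use (is the instance up?)"; exit 2 }
            print "OK"
        }' "$file"
}

# Usage on a live server (value from Step 4):
#   hugepages_report /proc/meminfo 32768
```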

🔹 Step 9: Verify Oracle is Using HugePages

grep -i huge /proc/meminfo

Also check alert log:

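grep -iE -A4 "pagesize|large pages|hugepages" $ORACLE_BASE/diag/rdbms/*/*/trace/alert*.log

Expected result:

The wording depends on the release. 19c writes a page-size table at instance startup; the 2048K row should show the expected number of allocated pages with no errors.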

✅ Best Practices

✔ Always leave some RAM for OS (not 100% HugePages)
✔ Use ASMM (sga_target) with a fixed SGA size, not AMM
✔ Set HugePages slightly higher than requirement
✔ Monitor with:

vmstat

free -g
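To sanity-check the first best practice, the share of physical RAM that the HugePages pool would reserve can be computed before applying the setting. This is a sketch; hugepages_ram_pct is an illustrative name, and the 128 GB host in the example is a made-up figure, not taken from a real system.

```shell
#!/bin/bash
# hugepages_ram_pct PAGES PAGE_KB MEMTOTAL_KB
# Prints the percentage of physical RAM reserved by the HugePages pool.
# MEMTOTAL_KB is the MemTotal value from /proc/meminfo.
hugepages_ram_pct() {
    LC_ALL=C awk -v p="$1" -v s="$2" -v m="$3" \
        'BEGIN { printf "%.1f\n", 100 * p * s / m }'
}

# 32768 pages of 2 MB on a hypothetical 128 GB host
hugepages_ram_pct 32768 2048 134217728   # prints 50.0
```

The remainder has to cover the PGA, server processes, the OS, and the filesystem cache, none of which can use the HugePages pool.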

⚠️ Common Mistakes

❌ Using AMM (memory_target)
❌ Not setting memlock
❌ THP not disabled
❌ Underestimating number of HugePages

🚀 Quick Example Summary

For a 64 GB SGA:

Setting	Value
HugePage size	2 MB
Required pages	32768
memlock	67108864 KB
vm.nr_hugepages	32768

Oracle Script for huge_page setting

vi hugepages_settings.sh

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by KB151310 from My Oracle Support
# http://support.oracle.com

# Welcome text
echo "
This script is provided by KB151310 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating the overall size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database Automatic Memory Management (AMM) is not setup
   (See KB83222)
 * The shared memory segments can be listed by command:
     # ipcs -m

Press Enter to proceed..."

read

# Check for the kernel version (kept from the original; not used further in this copy)
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size (in KB)
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi

# Initialize the counter
NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        # One extra page per segment covers the remainder of the division
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:

    # ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running
 * Oracle Database Automatic Memory Management (AMM) is not configured"
    exit 1
fi

# Finish with results
echo "Recommended setting: vm.nr_hugepages = $NUM_PG";

# End

Friday, April 24, 2026

Oracle Database – Detailed History

1. Origins of Oracle Database (1977–1982)

The Relational Database Idea

The foundation of Oracle Database comes from Dr. Edgar F. Codd’s relational model (1970, IBM).
IBM published research but did not commercialize it immediately.

Oracle Corporation Formation

1977: Larry Ellison, Bob Miner, and Ed Oates founded Software Development Laboratories (SDL).
Objective: build a commercial relational database, inspired by IBM’s System R paper.
Key difference: Oracle targeted multiple platforms, while IBM focused on mainframes.

2. Oracle Version 2 – First Commercial RDBMS (1979)

There was no Oracle Version 1 (marketing choice).

Key Highlights

Oracle V2 (1979) was the first commercially available SQL-based RDBMS.
Written in assembly language.
Ran on DEC PDP-11 systems (and on VAX in PDP-11 compatibility mode).
Supported basic SQL (SELECT, INSERT, UPDATE, DELETE).

Importance

✅ First mover advantage
✅ SQL as a public standard
✅ Database independent of hardware

3. Oracle Version 3 – Portability Revolution (1983)

Major Advancements

Rewritten entirely in C language.
Enabled platform portability (UNIX, VMS, later Windows).
Introduced the concept of Oracle being OS-independent.

Strategic Impact

✅ Oracle could run everywhere
✅ Faster customer adoption
✅ Differentiated sharply from IBM DB2

4. Oracle Version 4 & 5 – Client/Server Era Begins (1984–1987)

Oracle V4

Introduced multiversion read consistency
Improved data dictionary

Oracle V5

Introduced Client/Server architecture
SQL*Net allowed remote DB access
Enabled database connectivity over networks

5. Oracle Version 6 – Enterprise Scalability (1988)

Game-Changing Features

Row-level locking (vs table-level locking)
Online backups
Read consistency using rollback segments
First steps toward enterprise reliability

✅ Enabled high-concurrency OLTP systems
✅ Became viable for large enterprises

6. Oracle 7 – The Enterprise Database (1992)

Widely regarded as Oracle’s first truly mature enterprise database.

Major Innovations

Cost-Based Optimizer (CBO) introduced
Stored procedures
Triggers
Declarative referential integrity
Shared SQL area
Improved redo and recovery

Business Impact

✅ Massive enterprise adoption
✅ Oracle became dominant in banking, telecom, ERP systems

7. Oracle 8 & 8i – Object & Internet Age (1997–2000)

Oracle 8

Object-relational features
User-defined types
Partitioning introduced
Support for large objects (LOBs)

Oracle 8i (“Internet”)

Native Java inside the database
JVM running inside Oracle
XML support
Improved scalability for web applications

✅ Positioned Oracle as Internet-scale database

8. Oracle 9i – Grid Computing Foundations (2001)

Key Milestones

Real Application Clusters (RAC) introduced, replacing Oracle Parallel Server (OPS)
Flashback Query
Data Guard (physical & logical standby)
Automatic undo management

Strategic Shift

Oracle introduced Grid Computing:
“A pool of low-cost servers instead of big iron.”

✅ High availability
✅ Horizontal scalability

9. Oracle 10g – Grid Computing Matures (2003)

The “g” literally stood for Grid.

Major Additions

Automatic Storage Management (ASM)
AWR, ADDM
Automatic Shared Memory Management (ASMM)
Data Pump (expdp/impdp)
Enterprise Manager Grid Control

✅ Reduced DBA manual effort
✅ Strong focus on manageability

10. Oracle 11g – Self-Managing Database (2007)

Among the most widely used Oracle versions ever.

Key Features

Adaptive Cursor Sharing
SecureFiles (advanced LOBs)
Active Data Guard
Result Cache
Improved partitioning

Sub-Release 11gR2

Edition-Based Redefinition
RAC improvements
SCAN listeners
Better scalability

✅ Extremely stable
✅ Long enterprise lifecycle

11. Oracle 12c – Cloud & Multitenancy (2013)

The “c” stands for Cloud.

Biggest Architectural Change Ever

Multitenant Architecture

CDB (Container Database)
PDB (Pluggable Databases)
Database consolidation at scale

Other Enhancements

Heat Map
Automatic Data Optimization (ADO)
In-Memory Column Store
JSON support

✅ Cloud-ready architecture
✅ License optimization via consolidation

12. Oracle 18c & 19c – Autonomous Direction (2018–2019)

Oracle 18c

Essentially 12.2 rebranded
Minor functional changes
Marked shift to continuous release model

Oracle 19c (Long-Term Support)

Most stable 12c-based release
Automatic Indexing
Hybrid Partitioned Tables
High adoption worldwide

✅ Widely accepted as production standard

13. Oracle 21c – Innovation Release (2021)

Key Features

Blockchain tables
Native JSON data type
SQL Macros
In-Memory enhancements

⚠️ Short-term innovation release
⚠️ Not widely used for mission-critical production

14. Oracle 23c / 23ai – Modern Data Platform (2023–Present)

Focus Areas

AI/ML integration
JSON-Relational Duality
Sharding improvements
Microservices-friendly architecture
Vector data support
Cloud-native optimization

Strategic Direction

Autonomous Database
Oracle Cloud Infrastructure (OCI) first
Database as a managed service

✅ Designed for AI-driven and cloud-native workloads

15. Oracle Database Today – Strategic Position

Key Strengths

Mission-critical OLTP
High availability (RAC, Data Guard, Autonomous)
Security & compliance
Extreme scalability

Challenges

Competition from:
- PostgreSQL
- MySQL
- Cloud-native databases
Licensing complexity

16. Transition to the AI-Native Database Era

Oracle historically names database releases after major technology shifts:

Release	Meaning
9i	Internet
10g	Grid computing
12c	Cloud computing
23ai	Artificial Intelligence
26ai	AI‑native, agentic, multimodal data

17. Oracle Database 23ai (2024–Present)

Formerly Oracle Database 23c
Renamed to 23ai to reflect AI as a core capability, not an add-on

Release Classification

Long-Term Support (LTS) release
Premier Support until 2031
Successor to 19c in production roadmap

Summary Timeline

Era	Focus
1979–1987	Relational foundation
1988–1996	Enterprise OLTP
1997–2003	Internet & RAC
2004–2012	Grid & automation
2013–2018	Cloud & multitenant
2019–Now	Autonomous, AI, cloud-native

Wednesday, April 15, 2026

How Availability Numbers Are “Massaged” in SLAs

1. How Availability Numbers Are “Massaged” in SLAs

99.9999% is usually not measured the way engineers think it is.

Vendors almost never measure true end‑to‑end availability.

1.1 The Raw Formula (What Engineers Assume)

$$\text{Availability} = \frac{\text{Total Time} - \text{Downtime}}{\text{Total Time}}$$

For 99.9999%, downtime budget:

31.5 seconds / year

1.2 What SLAs Quietly EXCLUDE (Very Important)

Most SLAs exclude downtime caused by:

Excluded Category	Examples
Planned maintenance	PSU patches, GI upgrades
Customer actions	Bad SQL, dropped tables
Dependency failures	Network, DNS, IAM
DR tests	Switchover drills
Partial outages	One node down but cluster “up”
Performance degradation	Slow ≠ down

📌 Result:
The SLA uptime looks amazing, while users still experience outages.

2. “Availability of What?” (Classic SLA Trick)

SLA usually measures:

✅ Database process running

Business measures:

✅ Transaction success

These are not the same.

Example

Situation	SLA View	User View
RAC node eviction	DB is UP	Users get errors
GC contention	DB UP	App timing out
ADG apply lag	Primary UP	Data inconsistent
App pool exhaustion	DB UP	System down

📌 Availability ≠ Usability

3. Mapping Oracle Events to Downtime Consumption (Realistic)

Let’s assume a 99.9999% target (31.5 sec/year).

3.1 Oracle RAC Events

Event	Typical Impact	Downtime Budget Burn
Instance crash	5–30 sec	Most or all of the yearly budget
Node eviction	20–60 sec	Yearly budget at risk or exceeded
CRS restart	1–3 min	SLA blown
Cache reconfiguration	Milliseconds–seconds	Daily budget gone

✅ RAC improves availability
❌ RAC alone cannot hold six‑nines

3.2 Data Guard / FSFO Events

Event	Time
FSFO detection	5–10 sec
Failover execution	10–30 sec
App reconnect	5–20 sec

🔴 Total: 20–60 seconds
🔴 A single failover uses about two-thirds of the 99.9999% annual allowance (31.5 sec) at best and nearly double it at worst
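The budget burn of a single event can be worked out directly. The sketch below assumes a 365-day year (31,536,000 seconds); budget_burn_pct is an illustrative name.

```shell
#!/bin/bash
# budget_burn_pct EVENT_SECONDS AVAILABILITY_PCT
# Prints the share of the annual downtime budget consumed by one event.
budget_burn_pct() {
    LC_ALL=C awk -v d="$1" -v a="$2" 'BEGIN {
        budget = 31536000 * (100 - a) / 100   # allowed downtime per year, seconds
        printf "%.1f\n", 100 * d / budget
    }'
}

budget_burn_pct 20 99.9999   # fastest failover in the table: prints 63.4
budget_burn_pct 60 99.9999   # slowest failover in the table: prints 190.3
```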

3.3 Planned Events (Usually “Excluded”)

Activity	Real Impact
Rolling patch	Latency spikes
Switchover	Session drops
Backup I/O	Performance dip

Yet SLAs say: “No downtime occurred.”

4. Why Six‑Nines+ Stops Being a DB Metric

Once you cross five‑nines, availability is dominated by:

Application retry logic
Connection pool behavior
Graceful error handling
Client perception

📌 At this level, DB uptime is necessary but insufficient.
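Application retry logic is the first item on that list. As a rough sketch of the idea, a wrapper can retry a failing command with exponential backoff instead of surfacing the first error to the user. The retry function and the check_db.sh script named in the comment are hypothetical.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay after each failed attempt.
retry() {
    local max=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        sleep "$delay"
        attempt=$(( attempt + 1 ))
        delay=$(( delay * 2 ))   # exponential backoff
    done
}

# Usage (check_db.sh is a hypothetical health-check script):
#   retry 5 1 ./check_db.sh
```

A brief failover then shows up to the user as a slower response rather than an error, provided the total retry time stays within what the client will tolerate.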

5. Correct Way to Measure Availability (Mature Orgs)

Instead of raw uptime, elite teams measure:

Metric	Why It Matters
Successful transactions %	Real availability
Mean error rate	User impact
RTO (seconds)	Recovery speed
RPO (zero/near-zero)	Data safety
Error‑free deployments	Ops maturity
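The first metric in the table is a simple ratio. A minimal sketch, assuming the application already counts attempted and successful transactions; txn_availability_pct is an illustrative name and the counts in the example are invented.

```shell
#!/bin/bash
# txn_availability_pct SUCCESSFUL TOTAL
# Prints availability as the percentage of transactions that succeeded.
txn_availability_pct() {
    LC_ALL=C awk -v ok="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "n/a"; exit }
        printf "%.4f\n", 100 * ok / total
    }'
}

# 10 failed transactions out of one million
txn_availability_pct 999990 1000000   # prints 99.9990
```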

6. Architect‑Grade Statement (Use This)

You can safely say in reviews or audits:

“Availability percentages above five‑nines are typically achieved by excluding planned maintenance and partial failures. For stateful databases like Oracle, true end‑to‑end availability should be measured using transaction success and recovery objectives rather than SLA uptime alone.”

7. Executive Translation (Very Powerful)

“The system may technically be ‘up’, but availability is defined by whether customers can complete transactions without errors.”

8. Final Mental Model (Remember This)

99.9%     → Infrastructure resilience
99.99%    → Platform resilience
99.999%   → Automation maturity
99.9999%+ → Application experience

How to Calculate Allowed Downtime from a "Nines" SLA

1. The Core Formula (This Is the Only Formula Used)

$$\text{Downtime} = \text{Total Time} \times (1 - \text{Availability})$$

Where:

Availability is written as a decimal
(e.g., 99.9999% ⇒ 0.999999)
Total Time is expressed in the unit you care about
(year, month, day, etc.)

2. Convert 99.9999% Correctly (Common Mistake)

99.9999% is NOT 99.9999; divide by 100 before using it in the formula.

Correct conversion:

$$99.9999\% = \frac{99.9999}{100} = 0.999999$$

Downtime fraction:

1 - 0.999999 = 0.000001

👉 That’s one‑millionth of the time window

3. Total Time in One Year

A standard year:

$$365 \times 24 \times 60 \times 60 = 31{,}536{,}000 \text{ seconds}$$

4. Downtime Calculation for 99.9999%

$$\text{Downtime per year} = 31{,}536{,}000 \times 0.000001 = 31.536 \text{ seconds}$$
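The same calculation can be scripted for any availability target and time window. This is a sketch; downtime_seconds is an illustrative name, and the window defaults to a 365-day year.

```shell
#!/bin/bash
# downtime_seconds AVAILABILITY_PCT [WINDOW_SECONDS]
# Prints the allowed downtime in seconds for the given availability
# percentage over the given window (default: 365 days).
downtime_seconds() {
    LC_ALL=C awk -v a="$1" -v t="${2:-31536000}" \
        'BEGIN { printf "%.3f\n", t * (100 - a) / 100 }'
}

downtime_seconds 99.9999            # year: prints 31.536
downtime_seconds 99.9999 2592000    # 30-day month: prints 2.592
downtime_seconds 99.9999 86400      # day: prints 0.086
```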

✅ Final Answer (Core Result)

99.9999% availability allows:

31.5 seconds of downtime per year
~2.6 seconds per month
~0.086 seconds per day

5. Year / Month / Day Breakdown

Time Period	Allowed Downtime
Year	31.5 seconds
Month (30 days)	~2.6 seconds
Week	~0.6 seconds
Day	~0.086 seconds

📌 Meaning: A single Oracle cluster reconfiguration already burns the entire daily budget.

6. Comparison Across “Nines” (For Perspective)

Availability	Downtime / Year
99.9%	8.76 hours
99.99%	52.6 minutes
99.999%	5.26 minutes
99.9999%	31.5 seconds
99.99999%	3.15 seconds
99.999999%	0.315 seconds

7. Architect Reality Check (Very Important)

At 99.9999%:

A single occurrence of any of the following exceeds the daily or monthly budget:
- RAC reconfiguration
- Failover detection
- Network flap
- Patch‑related pause

👉 That’s why six‑nines and above are application‑experience claims, not database SLAs.

8. Interview / Design‑Review Ready Statement

You can safely say:

“99.9999% availability mathematically permits only 31.5 seconds of downtime per year. At this level, even automated failovers, cluster reconfigurations, or planned maintenance windows must be treated as availability‑impacting events.”

9. One‑Line Formula You Can Memorize

$$\text{Downtime per year (seconds)} = 31{,}536{,}000 \times (1 - \text{Availability})$$

Monday, April 13, 2026

HA (High Availability) vs DR (Disaster Recovery) – What’s the Difference?

HA vs DR – What’s the Difference?

HA and DR solve different problems.
Many outages happen because teams assume one replaces the other.

1. Simple One‑Line Difference (Easy to Remember)

Aspect	High Availability (HA)	Disaster Recovery (DR)
Purpose	Survive local failures	Survive site‑level disasters
Scope	Same data center / region	Different data center / region
Downtime	Seconds to minutes	Minutes to hours
Data Loss	None	Low to none
Automation	Very high	Medium to high

📌 Key rule

HA handles “small failures often”
DR handles “big failures rarely”

2. High Availability (HA) – Deep Explanation

✅ What HA Protects Against

Database instance crash
Node / VM failure
OS kernel panic
Network card failure
Storage path failure

❌ HA does NOT protect against

Data center fire/flood
Power grid failure
Region‑wide network outage
Human error affecting entire site

3. Oracle HA – How It Works

Example: Oracle RAC (Classic HA)

Users
  │
Load Balancer
  │
┌───────────────┐
│ Oracle RAC    │  Same Data Center
│ Node 1        │
│ Node 2        │
│ Shared Storage│
└───────────────┘

What Happens During Failure?

Node 1 crashes
Node 2 continues serving traffic
Sessions fail over automatically
Downtime: seconds

✅ This is High Availability

Oracle HA Tools

Oracle RAC
Oracle Restart
ASM redundancy
FAN / TAF
Application Continuity

HA Metrics

RTO: Seconds
RPO: Zero
Geography: Single site

4. Disaster Recovery (DR) – Deep Explanation

✅ What DR Protects Against

Data center outage
Fire, flood, earthquake
Power grid failure
Ransomware
Massive human error

❌ DR does NOT protect against

Single node crash (too slow)
Local HA events

5. Oracle DR – How It Works

Example: Oracle Data Guard

Primary Data Center
┌────────────────────┐
│ Oracle DB Primary  │
└─────────┬──────────┘
          │ Redo Transport
DR Data Center
┌─────────▼──────────┐
│ Oracle Standby DB  │
└────────────────────┘

What Happens During Failure?

Primary site is lost
Standby is activated
Applications reconnect
Downtime: minutes

✅ This is Disaster Recovery

Oracle DR Tools

Oracle Data Guard (sync/async)
Active Data Guard
Fast‑Start Failover (FSFO)
RMAN backups (last resort)

DR Metrics

RTO: Minutes–Hours
RPO: Seconds–Minutes
Geography: Separate site / region

6. HA vs DR – Side‑by‑Side Technical Comparison

Dimension	HA	DR
Distance	Meters	Kilometers
Failure Frequency	High	Low
Automation	Automatic	Semi-automatic to automatic
Cost	Medium	High
Complexity	Infrastructure	Operations + Infrastructure
Example	RAC	Data Guard

7. Real‑World Example (Very Important)

Scenario: Payroll System on Oracle

✅ With HA only (RAC)

DB node crashes → system survives
Storage path fails → system survives
Entire DC power down → system DOWN

❌ DR needed

✅ With DR only (Data Guard)

DB node crashes → outage until restart
OS hung → outage
Whole DC lost → system recovered

❌ HA needed

✅ With HA + DR (Correct Design)

     Users
       │
Application Layer (retry & continuity)
       │
────────── Primary Site ──────────
 Oracle RAC (HA)
       │
   Sync/Async Redo
────────── DR Site ──────────
 Data Guard Standby (DR)

✅ Node failure → RAC
✅ Instance crash → RAC
✅ Site failure → DG

📌 This is enterprise‑grade resilience

8. Common Misconceptions (Audit Findings)

❌ “We have RAC, so DR is not needed”
✅ RAC ≠ site failure protection

❌ “We have DR, so HA is unnecessary”
✅ DR failover is too slow for local failures

❌ “Availability % is the same as DR”
✅ Availability ≠ recoverability

9. Architectural Rule of Thumb (Remember This)

HA keeps the system running
DR brings the system back

10. Interview‑ & Review‑Ready Answer (Use This)

“High Availability addresses localized infrastructure failures within a site using technologies like Oracle RAC to provide automatic and immediate recovery. Disaster Recovery addresses catastrophic site‑level failures using geographically separated systems such as Oracle Data Guard, focusing on business continuity rather than instant recovery.”

11. One‑Line Executive Summary

HA = protect uptime
DR = protect the business