Thursday, January 8, 2026

Interview Question 17 : Type of SCN ?

1. Redo SCN vs Checkpoint SCN (hidden control mechanism)

Oracle tracks multiple SCNs internally, but two matter most here:

Redo SCN : Highest SCN whose redo is safely on disk

Checkpoint SCN : Lowest SCN for which all dirty buffers are written

Absolute rule

Checkpoint SCN ≤ Redo SCN

👉 Oracle never lets DBWR advance the checkpoint SCN beyond the redo SCN.

Why this matters

DBWR can write dirty blocks
But only blocks whose redo SCN ≤ redo flushed SCN
If not → LGWR is forced first

This is the real internal enforcement of WAL.

2. Commit path: why LGWR is synchronous and DBWR is not

What happens on COMMIT (timeline)

Client session
   |
   |-- generate COMMIT redo record
   |
   |-- post LGWR
           |
           |-- write redo buffer to redo log
           |
           |-- fsync / IO completion
   |
   |<- commit success returned

Key point:

Commit waits only for LGWR
Commit does NOT wait for DBWR

Why DBWR is excluded

If commit waited for DBWR:

Random I/O
Unpredictable latency
Massive scalability collapse

Oracle instead guarantees:

“If redo exists, we can always reproduce the data later”

3. Dirty buffer protection (the invisible gate)

Every dirty buffer in the buffer cache carries metadata:

Buffer SCN
Required redo SCN

Before DBWR writes a block:

DBWR checks required redo SCN
If redo is not on disk:
- DBWR signals LGWR
- DBWR waits
Only after redo flush completes:
- DBWR writes the block

This mechanism is sometimes called a:

Redo write barrier

It is completely internal — DBAs only see effects, not the barrier itself.

4. Checkpoints: who really coordinates LGWR vs DBWR

Checkpoint is not just DBWR writing blocks.

During checkpoint:

CKPT:
- Signals LGWR to flush redo up to checkpoint SCN
- Signals DBWR to write dirty buffers up to same SCN
LGWR always goes first
DBWR follows

Sequence:

CKPT
 ├─ LGWR → redo flush
 ├─ DBWR → dirty buffers
 └─ CKPT → datafile headers updated

Why datafile headers last:

They record checkpoint SCN
Must reflect redo that is already durable

5. RAC: same rule, more writers

In RAC, the rule becomes even stricter.

Extra components:

Redo threads per instance
LMS / GCS (global cache service)
Shared disk

Rule still holds:

No instance can let DBWR write a block unless redo for that block is durable in that instance’s redo thread.

Additional complexity:

Block shipping between instances
Commit SCN must be globally consistent
LGWR on own instance must flush before block can be shared as clean

This is why:

Log latency kills RAC performance
DBWR waits often hide behind global cache waits

6. What happens when LGWR is slow

Symptom 1: `log file sync` waits

Users waiting on commit
LGWR is bottleneck

Symptom 2: DBWR stalls

Dirty buffers accumulate
Buffer cache pressure increases
“Free buffer waits” may appear

Why DBWR suffers

Because:

DBWR cannot bypass LGWR
Data blocks are hostage to redo durability

Root causes usually are:

Slow redo disks
Small redo logs
Excessive commits
Misconfigured commit batching

7. Why Oracle chose redo-first instead of data-first

Let’s compare alternatives.

Option A: Write data first ❌

Breaks crash recovery
Corrupt datafiles possible
Unrecoverable inconsistencies

Option B: Write redo first ✅

Sequential I/O
Small writes
Perfect recovery model
Scales across CPU cores

This design decision is why Oracle can commit millions of TPS without writing datafiles synchronously.

8. Recovery proves LGWR’s primacy

After crash:

Oracle mounts database
Reads datafiles (possibly stale)
Reads redo logs
Applies redo → roll forward
Applies undo → roll back

Notice:

Datafiles are not trusted
Redo logs are trusted

That trust only exists because LGWR always wrote first.

9. Short mental model (architect‑level)

Redo = truth
Datafiles = cache
LGWR = durability gatekeeper
DBWR = background janitor

10. One‑line summary (perfect for interviews)

LGWR must write before DBWR because Oracle’s crash recovery depends on redo being the authoritative source of committed changes; DBWR is explicitly prevented from writing data blocks unless corresponding redo is already durable, enforcing Write‑Ahead Logging.

ORACLE DATABASE PROBLEM AND SOLUTIONS

Thursday, January 8, 2026

Interview Question 17 : Type of SCN ?

1. Redo SCN vs Checkpoint SCN (hidden control mechanism)

Absolute rule

Why this matters

2. Commit path: why LGWR is synchronous and DBWR is not

What happens on COMMIT (timeline)

Why DBWR is excluded

3. Dirty buffer protection (the invisible gate)

4. Checkpoints: who really coordinates LGWR vs DBWR

During checkpoint:

5. RAC: same rule, more writers

Extra components:

Rule still holds:

6. What happens when LGWR is slow

Symptom 1: `log file sync` waits

Symptom 2: DBWR stalls

Why DBWR suffers

Root causes usually are:

7. Why Oracle chose redo-first instead of data-first

Option A: Write data first ❌

Option B: Write redo first ✅

8. Recovery proves LGWR’s primacy

9. Short mental model (architect‑level)

10. One‑line summary (perfect for interviews)

No comments:

Post a Comment

Oracle AI Database Free support Platforms and Connectivity

Thursday, January 8, 2026

Interview Question 17 : Type of SCN ?

1. Redo SCN vs Checkpoint SCN (hidden control mechanism)

Absolute rule

Why this matters

2. Commit path: why LGWR is synchronous and DBWR is not

What happens on COMMIT (timeline)

Why DBWR is excluded

3. Dirty buffer protection (the invisible gate)

4. Checkpoints: who really coordinates LGWR vs DBWR

During checkpoint:

5. RAC: same rule, more writers

Extra components:

Rule still holds:

6. What happens when LGWR is slow

Symptom 1: log file sync waits

Symptom 2: DBWR stalls

Why DBWR suffers

Root causes usually are:

7. Why Oracle chose redo-first instead of data-first

Option A: Write data first ❌

Option B: Write redo first ✅

8. Recovery proves LGWR’s primacy

9. Short mental model (architect‑level)

10. One‑line summary (perfect for interviews)

No comments:

Post a Comment

Oracle AI Database Free support Platforms and Connectivity

Symptom 1: `log file sync` waits