✅ What is RPO (Recovery Point Objective)?
RPO = How much data loss is acceptable?
It defines the maximum span of time, counted back from the moment of failure, for which data may be lost when you recover your database.
In other words:
RPO tells you how much data you can afford to lose.
It's measured in time (seconds, minutes, hours).
📌 Database Example
Suppose:
- Your database takes backups every 1 hour
- A failure happens at 3:45 PM
- Last backup was at 3:00 PM
Then:
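- You lose 45 minutes of data in this incident
- In the worst case (a failure just before the next backup) you lose almost 1 hour, so this schedule can only meet an RPO of 1 hour or more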
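The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function names `data_loss` and `worst_case_loss` are made up for this post, and the calendar date is arbitrary because only the times of day come from the example.

```python
from datetime import datetime, timedelta


def data_loss(last_backup: datetime, failure: datetime) -> timedelta:
    """Data lost in one incident: everything written after the last backup."""
    if failure < last_backup:
        raise ValueError("failure time precedes the last backup")
    return failure - last_backup


def worst_case_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: the failure hits just before the next backup would run."""
    return backup_interval


# Times from the example; the date itself is an arbitrary assumption.
last_backup = datetime(2024, 1, 1, 15, 0)   # 3:00 PM
failure = datetime(2024, 1, 1, 15, 45)      # 3:45 PM
interval = timedelta(hours=1)               # backups every 1 hour

print("Data lost in this incident:", data_loss(last_backup, failure))
print("Worst-case loss with this schedule:", worst_case_loss(interval))
```

The first number describes one incident, while the second is what the backup schedule can guarantee, and that second number is the one to compare against your RPO.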
If your business says:
- “We cannot lose more than 5 minutes of data”
Then:
- Hourly backups are not enough; you must implement near real-time replication, e.g.,
  - PostgreSQL synchronous replication
  - SQL Server Always On availability groups with synchronous commit
  - Oracle Data Guard with synchronous redo transport
  - MySQL Group Replication
✅ What is RTO (Recovery Time Objective)?
RTO = How much time is acceptable to restore service?
It defines how quickly your database must be back online after a failure.
In other words:
RTO tells you how long you can afford your database to be down.
📌 Database Example
Suppose:
- Your database fails at 3:45 PM
- You restore from backup + perform recovery
- Everything is back online at 4:30 PM
Then:
- Downtime was 45 minutes, which is acceptable only if your RTO is 45 minutes or longer
If your business says:
- “Database must be back within 5 minutes”
Then you need:
- Automated failover
- Multi‑AZ synchronous replica
- Warm standby instance already running
- No manual restore
🎯 Putting Both Together (Database Scenario)
Scenario:
Your production PostgreSQL database crashes at 3:45 PM
- Last WAL archive was at 3:40 PM → 5 minutes of data lost
- Failover to standby completes at 3:47 PM → 2 minutes of downtime
This means:
- You lost 5 minutes of data (acceptable if your RPO is 5 minutes or more)
- System was down for 2 minutes (acceptable if your RTO is 2 minutes or more)
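The scenario above can be checked against targets with a small Python sketch. The `Incident` class and `meets_objectives` function are hypothetical names used only for illustration, the date is arbitrary, and the 5-minute targets are assumptions, since the scenario does not state what the business agreed to.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    last_recoverable: datetime  # last point covered by a backup or WAL archive
    failure: datetime           # when the database went down
    restored: datetime          # when service was back online

    @property
    def data_loss(self) -> timedelta:
        # Compare this against the RPO.
        return self.failure - self.last_recoverable

    @property
    def downtime(self) -> timedelta:
        # Compare this against the RTO.
        return self.restored - self.failure


def meets_objectives(incident: Incident, rpo: timedelta, rto: timedelta) -> bool:
    """True when both the data loss and the downtime stay within target."""
    return incident.data_loss <= rpo and incident.downtime <= rto


# Times from the scenario; the date is an arbitrary assumption.
incident = Incident(
    last_recoverable=datetime(2024, 1, 1, 15, 40),  # 3:40 PM
    failure=datetime(2024, 1, 1, 15, 45),           # 3:45 PM
    restored=datetime(2024, 1, 1, 15, 47),          # 3:47 PM
)

# Assumed business targets, not taken from the scenario.
rpo_target = timedelta(minutes=5)
rto_target = timedelta(minutes=5)

print("Data loss:", incident.data_loss)
print("Downtime:", incident.downtime)
print("Within targets:", meets_objectives(incident, rpo_target, rto_target))
```

Keeping the measured values (`data_loss`, `downtime`) separate from the targets makes it clear that RPO and RTO are limits you agree on in advance, not numbers you read off after an outage.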
🧩 Quick Reference
| Term | Meaning (Simple) | Database Interpretation |
|---|---|---|
| RPO | How much data you can lose | Maximum acceptable gap between the last recoverable data and the failure time |
| RTO | How long you can be down | Maximum acceptable time for the database to become operational again |
🔥 Real-World DB Examples You Can Use
1. Single‑AZ Database
- Backups every night
- No replication
- RPO = 24 hours (you can lose up to 1 day of data)
- RTO = many hours (need to restore backup)
2. Multi‑AZ Synchronous Replication
- Data committed on both nodes
- Failover is automatic
- RPO ≈ 0 seconds
- RTO = 30–120 seconds
3. Multi‑Region Asynchronous Replication
- Slight replication lag (5–15 seconds)
- RPO = roughly the replication lag (seconds)
- RTO = a few minutes
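To see how these three setups compare against a given requirement, here is a minimal Python sketch. The figures are rough worst-case readings of the examples above, and two of them are assumptions made purely for illustration: "many hours" is taken as 4 hours and "a few minutes" as 5 minutes. The names `ARCHITECTURES` and `candidates` are hypothetical.

```python
from datetime import timedelta

# (worst-case data loss, worst-case downtime) per setup.
# Values marked "assumed" are illustrative guesses, not measurements.
ARCHITECTURES = {
    "single-az nightly backup": (
        timedelta(hours=24),
        timedelta(hours=4),       # assumed value for "many hours"
    ),
    "multi-az synchronous": (
        timedelta(0),
        timedelta(seconds=120),   # upper end of 30-120 seconds
    ),
    "multi-region asynchronous": (
        timedelta(seconds=15),    # upper end of the 5-15 second lag
        timedelta(minutes=5),     # assumed value for "a few minutes"
    ),
}


def candidates(rpo_target: timedelta, rto_target: timedelta) -> list[str]:
    """Return the setups whose worst-case figures fit within both targets."""
    return [
        name
        for name, (rpo, rto) in ARCHITECTURES.items()
        if rpo <= rpo_target and rto <= rto_target
    ]


# Example requirement: lose at most 5 minutes of data, be down at most 5 minutes.
print(candidates(timedelta(minutes=5), timedelta(minutes=5)))

# Stricter requirement: no data loss, back within 2 minutes.
print(candidates(timedelta(0), timedelta(minutes=2)))
```

In practice you would replace these figures with numbers measured in your own failover and restore drills.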
⭐ Summary (Very Simple)
- RPO = How much data can I lose?
- RTO = How long can I be down?
Both are set by the business and met through database architecture.