Capacity Planning
Ensure you have enough resources to meet demand reliably, without overspending.
Theory
Capacity planning balances two goals: serving demand reliably and keeping cost under control. The process has four steps:
- Forecast demand — from growth trends, seasonality, and planned events
- Load test to find each component’s limits and per-unit capacity
- Provision with headroom — run below max so spikes and failures don’t tip you over
- Autoscale for elastic demand; reserve a buffer for the N+1 case (survive losing one unit/zone)
Key tension: too little capacity causes outages; too much wastes money. Watch saturation (one of the Golden Signals) and lead times — some resources can’t be acquired instantly.
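The steps above can be sketched as a small sizing calculation. This is a minimal illustration, not a standard formula: the function names, the 75% target utilization, and the sample numbers are all assumptions chosen for the example.

```python
import math


def forecast_peak_rps(baseline_rps: float, growth_factor: float,
                      event_multiplier: float) -> float:
    """Project peak demand from a baseline, a growth trend, and a planned event.

    All three inputs are hypothetical; a real forecast would come from
    historical traffic data.
    """
    return baseline_rps * growth_factor * event_multiplier


def instances_needed(peak_rps: float, per_instance_rps: float,
                     target_utilization: float = 0.75,
                     spare_units: int = 1) -> int:
    """Size a fleet for peak demand with headroom and an N+1 buffer.

    per_instance_rps is the limit found by load testing.
    target_utilization is the fraction of that limit you plan to run at
    (assumed 0.75 here), and spare_units is the N+1 buffer.
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Headroom: only count part of each instance's tested limit as usable.
    usable_rps = per_instance_rps * target_utilization
    base = math.ceil(peak_rps / usable_rps)
    # N+1: add spare units so losing one does not drop below the base count.
    return base + spare_units


if __name__ == "__main__":
    peak = forecast_peak_rps(baseline_rps=1000, growth_factor=1.5,
                             event_multiplier=2.0)
    print(instances_needed(peak, per_instance_rps=200))
```

If the load-tested figure is already a safe rate rather than a hard limit, pass target_utilization=1.0 so the headroom is not counted twice.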
Real-World Example
Capacity for a Black Friday spike:
Baseline: 2,000 req/s, autoscaled
Forecast: 6x peak = 12,000 req/s
Load test: one instance handles 500 req/s safely
Plan: 12,000 / 500 = 24 instances at peak, plus an N+1 buffer and headroom for the unknown
Pre-scale before the event (autoscaling lag + provisioning lead time)
Hands-On Exercise
- Outline the steps of capacity planning.
- Explain N+1 provisioning and why it matters.
- Why provision headroom rather than run at 100% utilization?
- How does load testing feed capacity decisions?
Cheat Sheet
| Step | Detail |
|---|---|
| Forecast | Growth, seasonality, events |
| Load test | Find per-unit limits |
| Headroom | Run below max |
| N+1 | Survive losing one unit |
| Autoscale | Elastic demand |
| Saturation | Golden signal to watch |
| Lead time | Some capacity isn’t instant |
Common Interview Questions
What is N+1 capacity planning?
Provisioning enough capacity to keep serving demand even after losing one component (instance, zone, or region) — so a single failure doesn’t cause an outage.
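One way to check an N+1 claim is to remove the largest unit and see whether the rest still covers peak demand. The sketch below assumes each unit (instance, zone, or region) has a known capacity in req/s; the function name and figures are hypothetical.

```python
from typing import Sequence


def survives_single_loss(unit_capacities: Sequence[float],
                         peak_demand: float) -> bool:
    """Return True if peak demand is still covered after losing one unit.

    The worst case is losing the largest unit, so that is the one removed.
    """
    if not unit_capacities:
        return False
    remaining = sum(unit_capacities) - max(unit_capacities)
    return remaining >= peak_demand


if __name__ == "__main__":
    # Three zones of 6,000 req/s each against a 12,000 req/s peak.
    print(survives_single_loss([6000, 6000, 6000], 12000))
    # Two zones of the same size: losing one leaves only 6,000 req/s.
    print(survives_single_loss([6000, 6000], 12000))
```

With three equal zones the check passes; with two it fails, because the surviving zone alone cannot carry the peak.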
Why not run systems at 100% utilization?
There’s no headroom for traffic spikes, failures, or deploys — any of which would immediately overload the system. Headroom turns a spike or failure into a non-event.
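A rough way to quantify headroom is to ask how large a spike fits at a given utilization, and what utilization becomes when a unit fails. The sketch assumes load spreads evenly across identical units, which is a simplification; the function names are hypothetical.

```python
def max_absorbable_spike(utilization: float) -> float:
    """Fractional traffic increase that fits before reaching 100% utilization.

    For example, at 0.5 utilization traffic can double (a spike of 1.0).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 / utilization - 1.0


def utilization_after_loss(utilization: float, units: int,
                           lost: int = 1) -> float:
    """Utilization of the surviving units once `lost` units fail.

    Assumes identical units and perfectly even load balancing.
    """
    if units <= 0 or lost < 0 or lost >= units:
        raise ValueError("need at least one surviving unit")
    return utilization * units / (units - lost)


if __name__ == "__main__":
    # At 100% utilization there is no room for any spike.
    print(max_absorbable_spike(1.0))
    # Four units at 75% each: losing one pushes the other three to 100%.
    print(utilization_after_loss(0.75, units=4))
```

Under these assumptions, a fleet already at 100% absorbs a spike of zero, and any result above 1.0 from utilization_after_loss means the survivors are overloaded.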