
Scaling handles growing demand. Availability ensures your app stays online even when things fail.
Vertical scaling (scaling up): buy a more powerful server.
Before: 4GB RAM, 2 CPU
After: 64GB RAM, 8 CPU
Cost: 2x-3x more expensive
Problem: Eventually hits hardware limits
Pros: Simple, one server to manage
Cons: Expensive, has limits, downtime to upgrade
Horizontal scaling (scaling out): add more servers that share the load.
Before: 1 server handling 1,000 req/sec
After: 4 servers handling 4,000 req/sec
Cost: Linear growth with demand
Benefit: Capacity keeps growing as servers are added, with no single-machine hardware ceiling
Pros: Scalable, cost-effective, resilient
Cons: Requires load balancer, more complex
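The before/after numbers above imply a simple sizing rule: divide the expected demand by what one server can handle and round up. The sketch below illustrates that arithmetic. It assumes identical servers and perfectly linear scaling, which real systems only approximate, and the function name `servers_needed` is hypothetical.

```python
import math


def servers_needed(demand_rps: float, per_server_rps: float) -> int:
    """Smallest number of identical servers that can absorb demand_rps.

    Assumes each server handles per_server_rps requests per second and
    that capacity adds up linearly (an idealization).
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    # Always keep at least one server, even with zero demand.
    return max(1, math.ceil(demand_rps / per_server_rps))


# Using the figure from the text: one server handles 1,000 req/sec.
print(servers_needed(1_000, 1_000))  # 1
print(servers_needed(4_000, 1_000))  # 4
print(servers_needed(4_500, 1_000))  # 5 (partial load still needs a whole server)
```

Rounding up matters: 4,500 req/sec does not fit on four servers at this capacity, so a fifth is required.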
Auto-scaling: automatically add or remove servers based on demand.
8 AM (Peak): Add servers
│
├─ 1,000 users → 5 servers
├─ 10,000 users → 50 servers
├─ 100,000 users → 500 servers
│
6 PM (Evening): Remove servers
├─ 10,000 users → 50 servers
└─ 1,000 users → 5 servers
How it works:
1. Monitor CPU usage
2. If CPU > 70% for 5 minutes → Add server
3. If CPU < 30% for 10 minutes → Remove server
4. Automatic, no manual intervention
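The four steps above can be sketched as a small decision function. This is a minimal illustration, not how any particular cloud provider implements auto-scaling. It assumes one average CPU sample per minute, adds or removes one server at a time, and never drops below a minimum fleet size. The names `ScalingPolicy` and `decide` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScalingPolicy:
    scale_out_cpu: float = 70.0   # add a server when CPU stays above this %
    scale_out_minutes: int = 5    # ...for this many consecutive minutes
    scale_in_cpu: float = 30.0    # remove a server when CPU stays below this %
    scale_in_minutes: int = 10    # ...for this many consecutive minutes
    min_servers: int = 1          # assumption: never scale to zero


def decide(cpu_history: Sequence[float], servers: int,
           policy: ScalingPolicy = ScalingPolicy()) -> int:
    """Return the new server count.

    cpu_history holds one average CPU % sample per minute, oldest first.
    """
    recent_out = cpu_history[-policy.scale_out_minutes:]
    recent_in = cpu_history[-policy.scale_in_minutes:]

    # Scale out only if we have a full window and every sample is high.
    if (len(recent_out) == policy.scale_out_minutes
            and all(cpu > policy.scale_out_cpu for cpu in recent_out)):
        return servers + 1

    # Scale in only if we have a full window and every sample is low.
    if (len(recent_in) == policy.scale_in_minutes
            and all(cpu < policy.scale_in_cpu for cpu in recent_in)):
        return max(policy.min_servers, servers - 1)

    return servers


print(decide([85, 90, 88, 92, 81], servers=5))  # 6: five hot minutes
print(decide([20] * 10, servers=5))             # 4: ten quiet minutes
print(decide([50] * 10, servers=5))             # 5: no change
```

The scale-in window is deliberately longer than the scale-out window, as in the rules above: adding capacity quickly is cheap insurance, while removing it too eagerly risks flapping.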
Savings:
Fixed servers: 50 servers × $100/month = $5,000
Auto-scaled: Average 15 servers × $100/month = $1,500
Savings: 70%
Load balancing distributes incoming requests across multiple servers.
      Incoming Request
             ↓
       Load Balancer
        /    |    \
       /     |     \
[Server1] [Server2] [Server3]
Algorithm:
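The text does not name a specific algorithm at this point. As an assumption, the sketch below uses round robin, one common choice that hands each new request to the next server in turn. The class name `RoundRobinBalancer` is hypothetical, and a real load balancer would also track server health and connection counts.

```python
from collections import Counter
from itertools import cycle
from typing import Sequence


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers: Sequence[str]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        # cycle() repeats the server list endlessly in the same order.
        self._rotation = cycle(list(servers))

    def route(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["Server1", "Server2", "Server3"])

# Route 900 requests and count how many each server received.
counts = Counter(balancer.route() for _ in range(900))
print(dict(counts))  # {'Server1': 300, 'Server2': 300, 'Server3': 300}
```

Round robin spreads requests evenly when requests cost roughly the same. If some requests are much heavier than others, a strategy based on current load may balance better.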
Example:
1,000 requests/sec
÷ 10 servers
= 100 requests/sec per server
(manageable, no overload)
High availability means the application stays online even when components fail.
Single Server (No HA):
Server crashes → App down → Customers angry
MTTR: 30 minutes
Multiple Servers (HA):
Server 1 crashes → Load balancer routes to Server 2 & 3
Application still works → Customers don't notice
MTTR: near zero (automatic failover)
Multi-region deployment: deploy across multiple regions worldwide.
Europe          Asia            Americas
├─ London       ├─ Tokyo        ├─ New York
├─ Frankfurt    ├─ Singapore    └─ Los Angeles
└─ Amsterdam    └─ Mumbai
If London data center fails → Traffic routes to Frankfurt
Benefits:
Users worldwide
↓
CloudFront (Global CDN - caches content)
↓
Multiple AWS regions
├─ US-East
├─ US-West
├─ Europe
└─ Asia-Pacific
Each region has:
├─ Load balancers
├─ Multiple app servers (auto-scaling)
└─ Multiple database replicas
Result:
| Metric | What it means | Target |
|---|---|---|
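| Availability | Percentage of time the service is up | 99.9%+ |
| MTTR | Mean time to recovery after a failure | < 5 minutes |
| RTO | Recovery time objective: maximum acceptable time to restore service | < 1 hour |
| RPO | Recovery point objective: maximum acceptable window of data loss | < 1 hour |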
| Latency | Response time | < 200ms |
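An availability percentage is easier to reason about when converted into allowed downtime. The sketch below does that conversion, assuming a 365-day year. The helper name `allowed_downtime_hours` is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted in the period at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

At the 99.9% target from the table, this works out to roughly 8.76 hours of downtime per year, which is about 44 minutes per month.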
Want production patterns? See Scaling & Availability (Experienced)