Ojasa Mirai

📈 GCP Scaling & Performance

Introduction

Google Cloud provides automatic scaling capabilities that adjust resources based on demand. Understanding scaling patterns and optimization techniques ensures your applications remain responsive and cost-effective under varying load.

Key Learning Outcomes

By the end of this lesson, you'll understand:

  • Horizontal and vertical scaling
  • Automatic scaling policies
  • Load balancing strategies
  • Caching and CDN
  • Database query optimization
  • Performance monitoring
  • Cost-efficiency in scaling

Horizontal vs. Vertical Scaling

Type        | Details
Horizontal  | Add more instances (Recommended)
Vertical    | Make instances larger (Limited)

Horizontal Scaling Example

# Create instance group with autoscaling
gcloud compute instance-groups managed create web-server-group \
  --base-instance-name=web-server \
  --template=web-server-template \
  --size=2 \
  --zone=us-central1-a

# Set autoscaling policy
gcloud compute instance-groups managed set-autoscaling web-server-group \
  --max-num-replicas=10 \
  --min-num-replicas=2 \
  --target-cpu-utilization=0.65 \
  --zone=us-central1-a
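
To make the 0.65 CPU target concrete, the sketch below approximates how a target-utilization policy sizes a group: it scales the current size by the ratio of observed to target utilization, then clamps the result to the configured minimum and maximum. This is a simplified model under stated assumptions, not the exact algorithm the Compute Engine autoscaler runs, and the function name `recommendedSize` is hypothetical.

```javascript
// Simplified model of target-utilization autoscaling (illustrative only).
function recommendedSize({ currentSize, avgCpu, targetCpu, min, max }) {
  // Total CPU demand, expressed as "instances' worth" of work.
  const demand = currentSize * avgCpu;
  // Instances needed so that average utilization falls to the target.
  const needed = Math.ceil(demand / targetCpu);
  // Never go below the minimum or above the maximum replica count.
  return Math.min(max, Math.max(min, needed));
}

// Two instances at 90% CPU with a 65% target: 1.8 / 0.65 rounds up to 3.
console.log(recommendedSize({ currentSize: 2, avgCpu: 0.9, targetCpu: 0.65, min: 2, max: 10 })); // 3
```

A lower target leaves more headroom for traffic spikes while new instances boot, at the price of running more instances on average.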

Cloud Run Auto-Scaling

Configure Scaling

gcloud run deploy my-service \
  --image gcr.io/my-project/my-image \
  --max-instances=100 \
  --min-instances=1 \
  --memory 512Mi \
  --cpu 1
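
To choose values for --min-instances and --max-instances, it can help to estimate how many instances a given load needs. The sketch below is a rough back-of-the-envelope model: requests in flight equal request rate times average latency, divided by the per-instance concurrency. It assumes the Cloud Run default concurrency of 80 requests per instance; `estimateInstances` is a hypothetical helper, and real scaling also depends on CPU usage and startup time.

```javascript
// Rough capacity estimate for a request-driven service (illustrative only).
function estimateInstances({ rps, avgLatencyMs, concurrency = 80, min = 1, max = 100 }) {
  // Average number of requests being processed at any moment.
  const inFlight = (rps * avgLatencyMs) / 1000;
  // Each instance handles up to `concurrency` requests at once.
  const needed = Math.ceil(inFlight / concurrency);
  // Respect the min/max bounds passed to `gcloud run deploy`.
  return Math.min(max, Math.max(min, needed));
}

// 2000 requests/s at 200 ms each is 400 in flight, or 5 instances at concurrency 80.
console.log(estimateInstances({ rps: 2000, avgLatencyMs: 200 })); // 5
```

If the estimate for peak traffic comes close to the maximum, raise --max-instances or reduce latency before the limit starts rejecting requests.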

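A smaller image starts faster, which shortens cold starts when Cloud Run adds instances. Multi-stage Dockerfile:

# Stage 1: install dependencies and build
# (assumes package.json defines a "build" script that outputs to dist/)
FROM node:18-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: runtime image with only the build output and dependencies
FROM node:18-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules

# Cloud Run routes requests to $PORT (8080 by default), so either read PORT
# in the server or deploy with --port=3000
EXPOSE 3000
CMD ["node", "dist/server.js"]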
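
The container's entry point, dist/server.js, is not shown in this lesson. A minimal sketch of what it might contain follows, assuming a plain Node.js HTTP server. It reads the PORT environment variable that Cloud Run injects, falls back to 3000 to match the EXPOSE line, and answers the /health path used by the load balancer health check. The names `resolvePort`, `handler`, and `start` are hypothetical.

```javascript
const http = require('http');

// Cloud Run injects PORT; fall back to 3000 for local runs.
function resolvePort(env = process.env) {
  const port = Number.parseInt(env.PORT, 10);
  return Number.isInteger(port) && port > 0 ? port : 3000;
}

// Plain request handler, kept separate so it can be tested without a socket.
function handler(req, res) {
  if (req.url === '/health') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'ok' }));
    return;
  }
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello\n');
}

// Call start() at the bottom of server.js to begin listening.
function start(env = process.env) {
  const server = http.createServer(handler);
  // Bind to 0.0.0.0 so the container accepts traffic from outside localhost.
  return server.listen(resolvePort(env), '0.0.0.0');
}
```

Keeping the health endpoint free of database calls means a slow dependency does not cause healthy instances to be taken out of rotation.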

Load Balancing

Set Up HTTP(S) Load Balancer

# Create health check
gcloud compute health-checks create http http-check \
  --request-path=/health \
  --port=3000

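# Create backend service (global, to match the global forwarding rule)
gcloud compute backend-services create my-backend \
  --health-checks=http-check \
  --protocol=HTTP \
  --enable-cdn \
  --global

# Map the named port "http" to the port the app listens on
gcloud compute instance-groups managed set-named-ports web-server-group \
  --named-ports=http:3000 \
  --zone=us-central1-a

# Add instances to backend
gcloud compute backend-services add-backends my-backend \
  --instance-group=web-server-group \
  --instance-group-zone=us-central1-a \
  --global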

# Create URL map
gcloud compute url-maps create my-lb \
  --default-service=my-backend

# Create HTTP proxy
gcloud compute target-http-proxies create my-proxy \
  --url-map=my-lb

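# Reserve the global static IP used by the forwarding rule
gcloud compute addresses create my-static-ip \
  --ip-version=IPV4 \
  --global

# Create forwarding rule
gcloud compute forwarding-rules create my-fw-rule \
  --global \
  --target-http-proxy=my-proxy \
  --address=my-static-ip \
  --ports=80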
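
Conceptually, the load balancer combines two jobs: tracking which backends pass the health check, and spreading requests across the healthy ones. The sketch below models that with a simple round-robin picker. It is a teaching model only; Google Cloud load balancers use their own balancing modes (such as utilization or request rate), and the class name `RoundRobinBalancer` and the backend names are hypothetical.

```javascript
// Toy round-robin balancer that skips unhealthy backends (illustrative only).
class RoundRobinBalancer {
  constructor(backends) {
    this.backends = backends.map((name) => ({ name, healthy: true }));
    this.next = 0; // index of the backend to try first on the next pick
  }

  // Record the result of a health check for one backend.
  setHealth(name, healthy) {
    const backend = this.backends.find((b) => b.name === name);
    if (backend) backend.healthy = healthy;
  }

  // Return the next healthy backend, or null if none is healthy.
  pick() {
    for (let i = 0; i < this.backends.length; i++) {
      const index = (this.next + i) % this.backends.length;
      if (this.backends[index].healthy) {
        this.next = (index + 1) % this.backends.length;
        return this.backends[index].name;
      }
    }
    return null;
  }
}

const balancer = new RoundRobinBalancer(['web-server-1', 'web-server-2', 'web-server-3']);
balancer.setHealth('web-server-2', false);
console.log(balancer.pick(), balancer.pick(), balancer.pick());
// web-server-1 web-server-3 web-server-1
```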

Caching Strategies

Application-Level Caching

Node.js with Redis:

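const redis = require('redis');

// node-redis v4+: connection settings go in a URL and commands return promises
const client = redis.createClient({ url: 'redis://redis-server:6379' });
client.on('error', (err) => console.error('Redis error:', err));
const ready = client.connect(); // connect once at startup

async function getCachedUser(userId) {
  await ready;
  const cacheKey = `user:${userId}`;

  // Try cache first
  const cached = await client.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Fetch from database (db is assumed to be a promise-based query helper)
  const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);

  // Store in cache (expire after 1 hour)
  await client.setEx(cacheKey, 3600, JSON.stringify(user));

  return user;
}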
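
A read-through cache like getCachedUser also needs a plan for writes, otherwise updated records stay stale until the TTL expires. The sketch below generalizes the pattern into a reusable helper and deletes the key after an update. To stay runnable without a Redis server it uses an in-memory stand-in; `MemoryCache`, `cacheAside`, and `updateUser` are hypothetical names, not part of any library.

```javascript
// In-memory stand-in for Redis so the sketch runs without a server.
class MemoryCache {
  constructor() { this.store = new Map(); }
  async get(key) {
    const entry = this.store.get(key);
    if (!entry || entry.expiresAt <= Date.now()) return null;
    return entry.value;
  }
  async setEx(key, ttlSeconds, value) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
  async del(key) { this.store.delete(key); }
}

// Cache-aside read: return the cached value, or load it and cache the result.
async function cacheAside(cache, key, ttlSeconds, load) {
  const cached = await cache.get(key);
  if (cached !== null) return JSON.parse(cached);
  const fresh = await load();
  // Skip caching misses so a record created later is not hidden for the full TTL.
  if (fresh != null) await cache.setEx(key, ttlSeconds, JSON.stringify(fresh));
  return fresh;
}

// After a write, delete the key so the next read reloads from the database.
async function updateUser(cache, userId, save) {
  await save();
  await cache.del(`user:${userId}`);
}
```

Deleting the key rather than rewriting it keeps the write path simple and avoids caching a value that a concurrent write has already replaced.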

HTTP Caching Headers

App configuration:

app.use((req, res, next) => {
  // Cache static assets for 1 year
  if (req.url.startsWith('/static/')) {
    res.set('Cache-Control', 'public, max-age=31536000, immutable');
  }
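  // Cache API responses for 5 minutes (use 'private' instead of 'public'
  // for per-user data so shared caches and the CDN do not store it)
  else if (req.url.startsWith('/api/')) {
    res.set('Cache-Control', 'public, max-age=300');
  }
  // HTML may be stored, but must be revalidated with the server before each reuse
  else {
    res.set('Cache-Control', 'public, max-age=0, must-revalidate');
  }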
  next();
});
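
Revalidation (the must-revalidate case above) only saves bandwidth if the server can answer "unchanged" without resending the body. That is what ETags are for: the server tags each response with a fingerprint, the client sends it back in If-None-Match, and a match yields 304 Not Modified. The sketch below shows the core logic under simplifying assumptions (a single strong ETag, no weak or wildcard matching); `etagFor` and `conditionalResponse` are hypothetical helpers. Express generates ETags for `res.send` responses by default, so hand-rolled logic like this is mainly useful for understanding the mechanism.

```javascript
const crypto = require('crypto');

// Fingerprint of the response body, quoted as the ETag syntax requires.
function etagFor(body) {
  const hash = crypto.createHash('sha1').update(body).digest('base64');
  return `"${hash}"`;
}

// Decide what to send, given the client's If-None-Match header (if any).
function conditionalResponse(ifNoneMatch, body) {
  const etag = etagFor(body);
  if (ifNoneMatch === etag) {
    // The client's copy is current: send headers only.
    return { status: 304, headers: { ETag: etag }, body: '' };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

const first = conditionalResponse(undefined, '<h1>Home</h1>');
const second = conditionalResponse(first.headers.ETag, '<h1>Home</h1>');
console.log(first.status, second.status); // 200 304
```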

Database Optimization

Connection Pooling

Node.js:

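// Assumes the mysql2 package; its promise API lets pool.query() be awaited
const mysql = require('mysql2/promise');

const pool = mysql.createPool({
  connectionLimit: 10,       // max open connections per app instance
  waitForConnections: true,  // queue requests when the pool is exhausted
  queueLimit: 0,             // 0 = no limit on queued requests
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME
});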
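
Pool size interacts with autoscaling: each instance opens its own pool, so the total number of database connections is roughly connectionLimit multiplied by the instance count. The sketch below works backwards from the database's connection limit to a safe per-instance pool size. The helper `poolSizePerInstance` and the figure of 250 maximum connections are hypothetical; check the actual limit of your Cloud SQL instance.

```javascript
// Largest per-instance pool that keeps the fleet under the database limit.
function poolSizePerInstance({ dbMaxConnections, maxInstances, reserved = 10 }) {
  // Keep some connections free for admin tools and migrations.
  const usable = dbMaxConnections - reserved;
  // At least 1 per instance; if that exceeds the budget, lower maxInstances.
  return Math.max(1, Math.floor(usable / maxInstances));
}

// With 100 instances and 250 connections, a pool of 10 would oversubscribe the database.
console.log(poolSizePerInstance({ dbMaxConnections: 250, maxInstances: 100 })); // 2
console.log(poolSizePerInstance({ dbMaxConnections: 250, maxInstances: 10 }));  // 24
```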

Query Optimization

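// Bad: N+1 query problem (one extra query per user)
const users = await db.query('SELECT * FROM users');
for (const user of users) {
  const posts = await db.query('SELECT * FROM posts WHERE user_id = ?', [user.id]);
  user.posts = posts;
}

// Good: Single query with JOIN
// Returns one flat row per user/post pair. In real code, select explicit
// columns and alias names that exist in both tables (such as id) so they
// do not overwrite each other in the result object.
const rows = await db.query(`
  SELECT users.*, posts.*
  FROM users
  LEFT JOIN posts ON users.id = posts.user_id
`);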
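
The JOIN returns one flat row per user/post pair, so the nested shape that the N+1 loop produced has to be rebuilt in application code. The sketch below groups rows by user. It assumes the query aliases its columns as `user_id`, `name`, `post_id`, and `title`; those column names and the helper `groupPostsByUser` are hypothetical.

```javascript
// Rebuild [{ id, name, posts: [...] }] from flat JOIN rows.
function groupPostsByUser(rows) {
  const users = new Map();
  for (const row of rows) {
    if (!users.has(row.user_id)) {
      users.set(row.user_id, { id: row.user_id, name: row.name, posts: [] });
    }
    // LEFT JOIN yields null post columns for users without posts.
    if (row.post_id !== null) {
      users.get(row.user_id).posts.push({ id: row.post_id, title: row.title });
    }
  }
  return [...users.values()];
}

const grouped = groupPostsByUser([
  { user_id: 1, name: 'Ada', post_id: 10, title: 'First' },
  { user_id: 1, name: 'Ada', post_id: 11, title: 'Second' },
  { user_id: 2, name: 'Lin', post_id: null, title: null },
]);
console.log(grouped.map((u) => `${u.name}: ${u.posts.length}`)); // [ 'Ada: 2', 'Lin: 0' ]
```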

Performance Monitoring

View Metrics

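# Read time series through the Cloud Monitoring API (timeSeries.list).
# PROJECT_ID is a placeholder; date -d is GNU date syntax.
START=$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)
END=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# CPU utilization
curl -s -G "https://monitoring.googleapis.com/v3/projects/PROJECT_ID/timeSeries" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  --data-urlencode 'filter=metric.type="compute.googleapis.com/instance/cpu/utilization"' \
  --data-urlencode "interval.startTime=$START" \
  --data-urlencode "interval.endTime=$END"

# Network I/O: same request with
#   filter=metric.type="compute.googleapis.com/instance/network/received_bytes_count"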
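
Infrastructure metrics such as CPU do not show what users experience, so it is common to record request latency in the application as well and look at high percentiles rather than averages. The sketch below times an async operation and computes a nearest-rank percentile; `timed` and `percentile` are hypothetical helpers, and a production service would usually export these values to a monitoring system instead of keeping them in an array.

```javascript
// Nearest-rank percentile: smallest sample with at least p% of samples at or below it.
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Run an async operation and record its duration in milliseconds.
async function timed(samples, fn) {
  const start = process.hrtime.bigint();
  try {
    return await fn();
  } finally {
    samples.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
}

// One slow request dominates the p95 even though the median looks healthy.
const latenciesMs = [120, 80, 95, 400, 110];
console.log(percentile(latenciesMs, 50), percentile(latenciesMs, 95)); // 110 400
```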

Best Practices

  • **Set appropriate limits** on autoscaling to prevent cost overruns
  • **Use caching** to reduce database load
  • **Optimize queries** with proper indexes
  • **Use connection pooling** in database clients
  • **Monitor metrics** to identify bottlenecks
  • **Test under load** to verify scaling behavior

Key Takeaways

  • **Horizontal scaling** distributes load across multiple instances
  • **Automatic scaling** adjusts resources based on metrics
  • **Load balancing** spreads traffic across healthy instances
  • **Caching** reduces latency and database load
  • **CDN** serves content from locations closer to users
  • **Monitoring** identifies performance issues early

Next Steps

Learn about Cloud Trace for detailed performance analysis, or explore optimization techniques for your specific application.


© 2026 Ojasa Mirai. All rights reserved.
