Ojasa Mirai


📊 GCP Monitoring - Advanced

Introduction

This lesson covers the advanced side of Cloud Monitoring: publishing custom metrics, automating alert policies, tracing requests across services with Cloud Trace, and tracking reliability with SLOs and SLIs in production applications.

Key Learning Outcomes

By the end of this lesson, you'll understand:

Custom metrics and time series

Alert policy automation

Distributed tracing with Cloud Trace

SLO/SLI implementation

Alerting best practices

Custom dashboards at scale

Observability patterns

Custom Metrics

Publish Custom Metrics

Node.js:

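const express = require('express');
const {MetricServiceClient} = require('@google-cloud/monitoring');

class MetricsPublisher {
  constructor(projectId) {
    this.projectId = projectId;
    this.client = new MetricServiceClient();
    this.projectName = this.client.projectPath(projectId);
  }

  // Writes one data point to a custom metric. Cloud Monitoring limits how
  // often points can be written to a single time series, so aggregate
  // high-volume values before calling this.
  async publishMetric(metricType, value, labels = {}) {
    const timeSeries = {
      metric: {
        type: `custom.googleapis.com/${metricType}`,
        labels
      },
      // Every time series needs a monitored resource; 'global' is the
      // generic choice when no more specific resource type applies.
      resource: {
        type: 'global',
        labels: {project_id: this.projectId}
      },
      points: [{
        interval: {
          endTime: {seconds: Math.floor(Date.now() / 1000)}
        },
        value: {doubleValue: value}
      }]
    };

    const request = {
      name: this.projectName,
      timeSeries: [timeSeries]
    };

    await this.client.createTimeSeries(request);
  }

  async recordLatency(operationName, durationMs) {
    await this.publishMetric('operation/latency_ms', durationMs, {
      operation: operationName
    });
  }

  // Publishes `value` as a single point; this is not a cumulative counter.
  async recordCounter(counterName, value = 1) {
    await this.publishMetric(`counter/${counterName}`, value);
  }
}

// Usage
const app = express();
const metrics = new MetricsPublisher('my-project');

app.get('/api/data', async (req, res) => {
  const start = Date.now();

  // fetchData is your own data-access function
  const data = await fetchData();

  const duration = Date.now() - start;
  // Don't block the response on metric delivery; log failures instead
  metrics.recordLatency('fetch_data', duration)
    .catch(err => console.error('Failed to publish metric:', err));

  res.json(data);
});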

Python:

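import time

from google.cloud import monitoring_v3


class CustomMetrics:
    def __init__(self, project_id):
        self.project_id = project_id
        self.client = monitoring_v3.MetricServiceClient()
        self.project_name = f'projects/{project_id}'

    def write_metric(self, metric_type, value, labels=None):
        series = monitoring_v3.TimeSeries()
        series.metric.type = f'custom.googleapis.com/{metric_type}'
        # Every time series needs a monitored resource
        series.resource.type = 'global'
        series.resource.labels['project_id'] = self.project_id

        for key, val in (labels or {}).items():
            series.metric.labels[key] = val

        # Build the point's end time from the current wall clock
        now = time.time()
        seconds = int(now)
        nanos = int((now - seconds) * 10**9)
        interval = monitoring_v3.TimeInterval(
            {'end_time': {'seconds': seconds, 'nanos': nanos}}
        )
        point = monitoring_v3.Point(
            {'interval': interval, 'value': {'double_value': float(value)}}
        )
        series.points = [point]

        self.client.create_time_series(
            name=self.project_name,
            time_series=[series]
        )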
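
Because each call to `write_metric` sends one point, high-volume counters are often summed in memory and flushed on an interval rather than written per event. A minimal sketch of that idea follows. `CounterBuffer` and its parameters are hypothetical names, not part of the Cloud Monitoring client library, and the `write` callback is assumed to accept `(metric_type, value, labels)`.

```python
import time
from collections import defaultdict


class CounterBuffer:
    """Hypothetical helper: sums increments in memory and flushes in batches."""

    def __init__(self, write, flush_interval_s=60.0, clock=time.monotonic):
        self._write = write              # callable(metric_type, value, labels)
        self._interval = flush_interval_s
        self._clock = clock              # injectable so the timing can be tested
        self._last_flush = clock()
        self._totals = defaultdict(float)

    def increment(self, name, amount=1.0):
        self._totals[name] += amount
        # Flush once the interval has elapsed since the previous flush
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        # Swap out the buffer first so later increments start from zero
        totals, self._totals = self._totals, defaultdict(float)
        self._last_flush = self._clock()
        for name, total in totals.items():
            self._write(f'counter/{name}', total, {})
```

With the class above, the buffer could be wired up as `CounterBuffer(CustomMetrics('my-project').write_metric)`, so the request path only touches memory and the API is called at most once per interval per counter.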

Alerting Policies

Create Alert with Terraform

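resource "google_monitoring_alert_policy" "high_error_rate" {
  display_name = "High Error Rate"
  combiner     = "OR"

  conditions {
    display_name = "Error rate > 5%"

    condition_threshold {
      # Numerator: 5xx responses only
      filter = <<-EOT
        resource.type = "cloud_run_revision" AND
        metric.type = "run.googleapis.com/request_count" AND
        metric.labels.response_code_class = "5xx" AND
        resource.labels.service_name = "my-service"
      EOT

      # Denominator: all requests, so the condition compares a ratio
      denominator_filter = <<-EOT
        resource.type = "cloud_run_revision" AND
        metric.type = "run.googleapis.com/request_count" AND
        resource.labels.service_name = "my-service"
      EOT

      comparison      = "COMPARISON_GT"
      threshold_value = 0.05
      duration        = "300s" # ratio must stay above the threshold this long

      aggregations {
        alignment_period     = "60s"
        per_series_aligner   = "ALIGN_RATE"
        cross_series_reducer = "REDUCE_SUM"
      }

      denominator_aggregations {
        alignment_period     = "60s"
        per_series_aligner   = "ALIGN_RATE"
        cross_series_reducer = "REDUCE_SUM"
      }
    }
  }

  # Assumes a google_monitoring_notification_channel named "email" is defined elsewhere
  notification_channels = [google_monitoring_notification_channel.email.id]
  documentation {
    content = "High error rate detected on my-service"
  }
}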
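
The intent of the "Error rate > 5%" condition is a ratio: failed requests divided by all requests in each alignment period, compared against the threshold. The Python sketch below is a simplified model of that evaluation, not Cloud Monitoring's actual engine. It assumes hypothetical per-minute request counts and a rule that the ratio must exceed the threshold in every window of the evaluation period before the alert fires.

```python
def error_ratios(windows):
    """windows: list of (error_count, total_count), one per alignment period."""
    return [errors / total if total else 0.0 for errors, total in windows]


def should_fire(windows, threshold=0.05, required_windows=5):
    """True when the last `required_windows` ratios all exceed the threshold."""
    ratios = error_ratios(windows)
    if len(ratios) < required_windows:
        return False
    return all(r > threshold for r in ratios[-required_windows:])


# Hypothetical per-minute (errors, total) samples
spike = [(2, 100), (9, 100), (1, 100), (1, 100), (1, 100)]
sustained = [(6, 100), (8, 100), (7, 100), (9, 100), (6, 100)]

print(should_fire(spike))      # False
print(should_fire(sustained))  # True
```

Requiring several consecutive windows is what keeps a one-minute spike from paging anyone while a sustained error rate still triggers.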

Alert Groups

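const {AlertPolicyServiceClient} = require('@google-cloud/monitoring');

class AlertManager {
  constructor(projectId) {
    this.client = new AlertPolicyServiceClient();
    this.projectName = this.client.projectPath(projectId);
  }

  async createAlertPolicy(config) {
    const request = {
      name: this.projectName,
      alertPolicy: {
        displayName: config.name,
        combiner: config.combiner || 'OR',
        notificationChannels: config.notificationChannels,
        conditions: config.conditions,
        documentation: {
          content: config.documentation
        }
      }
    };

    // The client resolves to an array; the first element is the policy
    const [policy] = await this.client.createAlertPolicy(request);
    return policy;
  }

  // Cloud Monitoring has no separate "alert group" resource. To keep
  // related signals together, put their conditions in one policy so they
  // share notification channels and documentation.
  async groupAlerts(name, conditions, notificationChannels = []) {
    return this.createAlertPolicy({
      name,
      combiner: 'OR',
      conditions,
      notificationChannels,
      documentation: 'Grouped policy for related conditions'
    });
  }
}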
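
The shape of `config.conditions` is not spelled out above. As a rough guide, the sketch below builds one threshold condition as a plain Python dict using the field names from the AlertPolicy REST representation. `build_threshold_policy` and its parameters are hypothetical names for illustration, and the filter assumes the custom latency metric published earlier in this lesson.

```python
import json


def build_threshold_policy(display_name, metric_filter, threshold,
                           duration_s=300, channels=()):
    """Hypothetical helper: returns an AlertPolicy body as a plain dict."""
    return {
        'displayName': display_name,
        'combiner': 'OR',
        'notificationChannels': list(channels),
        'conditions': [{
            'displayName': f'{display_name} condition',
            'conditionThreshold': {
                'filter': metric_filter,
                'comparison': 'COMPARISON_GT',
                'thresholdValue': threshold,
                'duration': f'{duration_s}s',
                'aggregations': [{
                    'alignmentPeriod': '60s',
                    'perSeriesAligner': 'ALIGN_MEAN',
                }],
            },
        }],
    }


policy = build_threshold_policy(
    'High latency',
    'metric.type = "custom.googleapis.com/operation/latency_ms" '
    'AND resource.type = "global"',
    threshold=500,
)
print(json.dumps(policy, indent=2))
```

Durations are strings such as `"300s"` in the JSON form; when passing the same structure to the Node.js client, they are typically written as objects like `{seconds: 300}` instead.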

Distributed Tracing

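const tracing = require('@opencensus/nodejs');
const {StackdriverTraceExporter} = require('@opencensus/exporter-stackdriver');

// Note: OpenCensus has been superseded by OpenTelemetry; new projects
// should prefer OpenTelemetry with a Cloud Trace exporter.
const exporter = new StackdriverTraceExporter({projectId: 'my-project'});
const {tracer} = tracing.start({exporter, samplingRate: 1});

app.get('/api/orders/:id', async (req, res) => {
  const span = tracer.startChildSpan({name: 'getOrder'});

  try {
    // getOrder is your own lookup function
    const order = await getOrder(req.params.id);
    span.addAttribute('orderId', order.id);
    span.addAttribute('amount', order.amount);
    // Respond inside the try block, where `order` is in scope
    res.json(order);
  } catch (err) {
    res.status(500).json({error: 'Failed to load order'});
  } finally {
    // Always close the span, even when the lookup fails
    span.end();
  }
});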

SLO/SLI Implementation

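class SLOCalculator {
  constructor(serviceMetrics) {
    // serviceMetrics must provide successfulRequests(start, end) and
    // totalRequests(start, end), each returning a count for the window
    this.metrics = serviceMetrics;
  }

  calculateSLI(startTime, endTime) {
    const successfulRequests = this.metrics.successfulRequests(startTime, endTime);
    const totalRequests = this.metrics.totalRequests(startTime, endTime);

    // No traffic means no failures; avoid dividing by zero
    if (totalRequests === 0) return 1;

    return successfulRequests / totalRequests;
  }

  calculateErrorBudget(sloTarget = 0.999) {
    const now = Date.now();
    const windowMs = 30 * 24 * 60 * 60 * 1000; // 30-day rolling window
    const currentSLI = this.calculateSLI(now - windowMs, now);

    const allowedErrorRate = 1 - sloTarget;
    const actualErrorRate = 1 - currentSLI;
    // Measured over the full window, this is the fraction of budget consumed
    const burnRate = actualErrorRate / allowedErrorRate;

    return {
      target: sloTarget,
      actual: currentSLI,
      // Fraction of the error budget still unspent (0 to 1)
      remaining: Math.max(0, 1 - burnRate),
      burnRate
    };
  }
}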
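
As a worked example of the error budget arithmetic, consider a 99.9% SLO over a 30-day window. The request counts below are hypothetical, and `error_budget` is an illustrative helper rather than part of any library.

```python
def error_budget(total, good, slo_target):
    """Returns (fraction of budget consumed, fraction remaining)."""
    allowed_bad = total * (1 - slo_target)   # failures the SLO tolerates
    actual_bad = total - good                # failures actually observed
    consumed = actual_bad / allowed_bad if allowed_bad else float('inf')
    return consumed, max(0.0, 1 - consumed)


# A 99.9% target over 30 days allows about 43.2 minutes of full downtime
window_minutes = 30 * 24 * 60
print(round(window_minutes * (1 - 0.999), 1))  # 43.2

# 1,000,000 requests with 600 failures against a budget of about 1,000
consumed, remaining = error_budget(total=1_000_000, good=999_400, slo_target=0.999)
print(round(consumed, 2), round(remaining, 2))  # 0.6 0.4
```

Here 60% of the budget is spent, leaving 40% for the rest of the window. Once the remaining fraction reaches zero, the service has missed its SLO for that window.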

Key Takeaways

**Custom metrics** enable business-specific monitoring

**Alert policies** automate incident detection

**Distributed tracing** shows request flow across services

**SLOs/SLIs** quantify reliability targets

**Error budgets** quantify how much unreliability is acceptable, balancing risk against release velocity

Next Steps

Explore advanced alerting with Incident Response, or learn about using Log Analytics for pattern detection.
