Module 4

rfashwall edited this page Apr 23, 2026 · 3 revisions

Module 4: Golang API Gateway

What You'll Build

By the end of this module, you'll have:

  • ✅ High-performance API Gateway written in Go (~17MB image vs ~900MB for a Python image)
  • ✅ Production-ready reverse proxy with request forwarding
  • ✅ Complete observability stack (structured logging + Prometheus metrics)
  • ✅ Request ID tracking for distributed tracing
  • ✅ Graceful shutdown handling for zero-downtime deployments
  • ✅ Backend health monitoring and automatic failover
  • ✅ Polyglot microservices architecture (Go + Python)

Real-World Impact:

  • ~67% lower memory requests than a Python-only architecture (based on the sizing example in Part 3)
  • Sub-millisecond request routing overhead
  • 5-10x faster startup (~1 second vs ~5-10 seconds) for rapid scaling
  • Professional-grade observability for production monitoring
  • Separation of concerns: Go for I/O, Python for ML

Learning Objectives

By the end of this module, you will:

  • ✅ Build a production-ready API Gateway in Go
  • ✅ Understand polyglot architecture (Go + Python)
  • ✅ Implement structured logging and metrics
  • ✅ Deploy Go services to Kubernetes
  • ✅ Monitor multi-language microservices

Part 1: Setup & Prerequisites

Why Go for the Gateway?

Aspect         | Go Gateway      | Python Alternative
Image Size     | ~17MB           | ~900MB
Memory Usage   | ~64MB           | ~200MB
Startup Time   | ~1 second       | ~5-10 seconds
CPU Efficiency | High (compiled) | Lower (interpreted)
Best For       | Routing, I/O    | ML, Data Science

Key Insight: Use the right tool for each job!

  • Go: Lightweight gateway/API layer
  • Python: ML inference where libraries matter

Prerequisites

  • Completed Module 3 (Kubernetes deployment)
  • Go 1.21+ installed
  • Docker and kind running
  • kubectl configured

Install Go

# macOS
brew install go

# Verify
go version

# Windows (using winget)
winget install GoLang.Go

# Or download the installer from https://go.dev/dl/
# Then verify
go version

Part 2: Hands-On Exercise

Quick Start

1. Complete the Exercise

API Gateway

Goal: Build a complete API Gateway in Go with all essential features.

cd modules/module-4/starter

# Open gateway.go in your editor

# Find and fill in 20 TODOs
# Look for: // YOUR CODE HERE

# Download dependencies
go mod download

# Build and test locally (ensure ML service is running)

# macOS/Linux/WSL
export BACKEND_URL=http://localhost:3000
go run gateway.go

# Windows PowerShell
$env:BACKEND_URL = "http://localhost:3000"
go run gateway.go

# In another terminal, test it (macOS/Linux/WSL)
curl http://localhost:8080/health
curl http://localhost:8080/metrics  # Prometheus metrics
curl -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Go is fast!"}'

# In another terminal, test it (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health
Invoke-RestMethod -Uri http://localhost:8080/metrics
$body = '{"text":"Go is fast!"}'
Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

# Quick smoke-test (macOS/Linux/WSL)
curl -fsS http://localhost:8080/health   >/dev/null && echo "health OK"
curl -fsS http://localhost:8080/metrics  >/dev/null && echo "metrics OK"
curl -fsS -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Go is fast!"}' && echo  # should print JSON response

# Quick smoke-test (Windows PowerShell)
try { Invoke-RestMethod -Uri http://localhost:8080/health | Out-Null; Write-Host "health OK" } catch { Write-Host "health FAILED" }
try { Invoke-RestMethod -Uri http://localhost:8080/metrics | Out-Null; Write-Host "metrics OK" } catch { Write-Host "metrics FAILED" }
$body = '{"text":"Go is fast!"}'; Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

Key TODOs (20 total):

Configuration (TODO 1)

TODO 1: Return a populated Config struct from loadConfig()

// FILL IN: Return Config struct with all fields populated from env vars
return &Config{
    Port:           getEnv("GATEWAY_PORT", "8080"),
    BackendURL:     getEnv("BACKEND_URL", "http://sentiment-api-service:80"),
    RequestTimeout: 10 * time.Second,
    LogLevel:       getEnv("LOG_LEVEL", "info"),
    Environment:    getEnv("ENVIRONMENT", "production"),
}
// Hint: Replace the `return nil`. Every field except RequestTimeout is read with getEnv, so the gateway is configurable via environment variables; the timeout is fixed at 10 seconds

Prometheus Metrics (TODOs 2-3)

TODO 2: Define the HTTP request counter

// FILL IN: promauto.NewCounterVec with method, endpoint, and status labels
httpRequestsTotal = promauto.NewCounterVec(
    prometheus.CounterOpts{
        Name: "gateway_http_requests_total",
        Help: "Total number of HTTP requests",
    },
    []string{"method", "endpoint", "status"},
)
// Hint: Replace the `= nil` — the status label lets you split success vs error counts in Grafana

TODO 3: Define the HTTP request duration histogram

// FILL IN: promauto.NewHistogramVec with method and endpoint labels
httpRequestDuration = promauto.NewHistogramVec(
    prometheus.HistogramOpts{
        Name:    "gateway_http_request_duration_seconds",
        Help:    "HTTP request duration in seconds",
        Buckets: prometheus.DefBuckets,
    },
    []string{"method", "endpoint"},
)
// Hint: Replace the `= nil` — DefBuckets gives you the standard latency buckets (5ms … 10s)

Middleware (TODOs 4-6)

TODO 4: Generate and propagate a request ID in requestIDMiddleware

// FILL IN: Reuse incoming X-Request-ID or generate a fresh UUID
requestID := r.Header.Get("X-Request-ID")
if requestID == "" {
    requestID = uuid.New().String()
}
ctx := context.WithValue(r.Context(), "request_id", requestID)
w.Header().Set("X-Request-ID", requestID)
next.ServeHTTP(w, r.WithContext(ctx))
// Hint: Storing the ID in the context lets every downstream handler retrieve it without passing it explicitly
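The snippet above stores the ID under the plain string key "request_id". That works, but string keys can collide with keys set by other packages, so Go code often uses an unexported key type instead. The sketch below shows that variant under a few assumptions: the names ctxKey, withRequestID, requestIDFrom, and echoID are illustrative, and crypto/rand stands in for uuid.New() only to keep the example free of third-party imports. If you adopt a typed key, every place that reads the ID must use the same key.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"net/http"
	"net/http/httptest"
)

// ctxKey is unexported, so no other package can produce a colliding key.
type ctxKey struct{}

// newID returns 16 random bytes as 32 hex characters. The module itself
// uses uuid.New().String(); this is a dependency-free stand-in.
func newID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// withRequestID reuses an incoming X-Request-ID or generates a new one,
// echoes it on the response, and stores it in the request context.
func withRequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Request-ID")
		if id == "" {
			id = newID()
		}
		w.Header().Set("X-Request-ID", id)
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// requestIDFrom returns the stored ID, or "" when none is present.
func requestIDFrom(ctx context.Context) string {
	id, _ := ctx.Value(ctxKey{}).(string)
	return id
}

// echoID sends one request through the middleware and reports the ID the
// handler saw and the ID returned in the response header.
func echoID(incoming string) (seen, header string) {
	h := withRequestID(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen = requestIDFrom(r.Context())
	}))
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	if incoming != "" {
		req.Header.Set("X-Request-ID", incoming)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return seen, rec.Header().Get("X-Request-ID")
}
```

Calling echoID("test-123") should report "test-123" for both values, which matches the curl -H "X-Request-ID: test-123" check in the Debugging Commands section.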

TODO 5: Call the next handler and log the completed request in loggingMiddleware

// FILL IN: Delegate to next, then emit a structured completion log
next.ServeHTTP(rw, r)
duration := time.Since(start)
logger.Info("request completed",
    "request_id", requestID,
    "method", r.Method,
    "path", r.URL.Path,
    "status", rw.statusCode,
    "duration_ms", duration.Milliseconds(),
    "bytes_written", rw.written)
// Hint: rw is the wrapped responseWriter that captures status code and bytes — use it, not w

TODO 6: Record Prometheus metrics after each request in metricsMiddleware

// FILL IN: Increment the counter and observe the duration using the captured status code
httpRequestsTotal.WithLabelValues(r.Method, r.URL.Path, fmt.Sprintf("%d", rw.statusCode)).Inc()
httpRequestDuration.WithLabelValues(r.Method, r.URL.Path).Observe(duration)
// Hint: duration is already computed as time.Since(start).Seconds() just above — reuse it
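One thing to keep in mind with r.URL.Path as a label: handleRoot is registered on "/", which matches every path that no other route claims, so each distinct path a client requests becomes its own time series. A common way to bound this is to map paths to a fixed set of label values first. The sketch below assumes the five routes registered in TODO 7; endpointLabel and the "other" bucket are illustrative names, not part of the starter file.

```go
package main

// knownRoutes lists the endpoints registered on the mux in this module.
var knownRoutes = map[string]bool{
	"/":              true,
	"/health":        true,
	"/predict":       true,
	"/batch_predict": true,
	"/metrics":       true,
}

// endpointLabel maps a request path to a bounded set of label values so
// that unexpected paths cannot create unlimited Prometheus series.
func endpointLabel(path string) string {
	if knownRoutes[path] {
		return path
	}
	return "other"
}
```

In the middleware you would then pass endpointLabel(r.URL.Path) to WithLabelValues instead of the raw path.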

Handlers (TODOs 7-10)

TODO 7: Create the HTTP mux and register all routes in createHandler

// FILL IN: Wire up every endpoint and wrap the mux with all three middlewares
mux := http.NewServeMux()
mux.HandleFunc("/health", g.handleHealth)
mux.HandleFunc("/predict", g.handlePredict)
mux.HandleFunc("/batch_predict", g.handleBatchPredict)
mux.Handle("/metrics", promhttp.Handler())
mux.HandleFunc("/", g.handleRoot)
return chain(mux, requestIDMiddleware, loggingMiddleware(g.logger), metricsMiddleware)
// Hint: Replace `return nil` — chain applies middlewares in order: request ID → logging → metrics
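The chain helper is provided by the starter file and is not shown here. For intuition, this is one way such a helper might be written so that the first middleware listed is the outermost one. The middleware type, the tag helper, and callOrder are assumptions made for the demonstration.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"strings"
)

// middleware is the usual "wrap a handler" function shape.
type middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed runs first on the way
// in. It wraps from the last middleware to the first to achieve that.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag returns a demo middleware that records its name when it is entered.
func tag(name string, trace *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// callOrder sends one request through three tagged middlewares and returns
// the order in which they were entered.
func callOrder() string {
	var trace []string
	h := chain(http.NotFoundHandler(),
		tag("request-id", &trace),
		tag("logging", &trace),
		tag("metrics", &trace))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return strings.Join(trace, " > ")
}
```

The order matters because the logging middleware reads the request ID from the context, so the request ID middleware has to run before it.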

TODO 8: Reject non-GET methods in handleHealth

// FILL IN: Return 405 for any method that is not GET
if r.Method != http.MethodGet {
    http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
    return
}
// Hint: Health checks should only respond to GET; rejecting early avoids unnecessary backend probes

TODO 9: Probe the backend /health endpoint with a short timeout

// FILL IN: Create a 2-second context and fire a GET to the backend
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
req, err := http.NewRequestWithContext(ctx, http.MethodGet, g.config.BackendURL+"/health", nil)
var resp *http.Response
if err == nil { // only send the probe if the request could be built (e.g. BACKEND_URL is a valid URL)
    resp, err = g.httpClient.Do(req)
}
// Hint: A dedicated short timeout prevents a slow backend from blocking the gateway's own health response

TODO 10: Include the backend's reachability in the health JSON response

// FILL IN: Populate health["backend"] based on whether err is nil
if err != nil {
    health["backend"] = "unreachable"
    health["backend_error"] = err.Error()
} else {
    resp.Body.Close()
    health["backend"] = "ok"
    health["backend_status"] = resp.StatusCode
}
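// Hint: Kubernetes HTTP probes look only at the status code (200-399 counts as success), not the JSON body. These fields help humans and dashboards; for K8s to route traffic away, the handler must also return a non-2xx status such as 503 when the backend is unreachable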

Proxy Logic (TODOs 11-16)

TODO 11: Read the incoming request body in proxyRequest

// FILL IN: Drain r.Body so you can forward the raw bytes to the backend
body, err := io.ReadAll(r.Body)
if err != nil {
    g.logger.Error("failed to read request body", "request_id", requestID, "error", err)
    http.Error(w, "Failed to read request", http.StatusBadRequest)
    return
}
defer r.Body.Close()
// Hint: Reading the body fully lets you log its size and forward it as a fresh reader to the backend

TODO 12: Build the outbound request to the backend service

// FILL IN: Construct a new request targeting the backend URL + path
backendURL := g.config.BackendURL + path
req, err := http.NewRequestWithContext(r.Context(), r.Method, backendURL, bytes.NewReader(body))
if err != nil {
    g.logger.Error("failed to create backend request", "request_id", requestID, "error", err)
    http.Error(w, "Internal server error", http.StatusInternalServerError)
    return
}
// Hint: Passing r.Context() propagates the deadline and cancellation from the original client request

TODO 13: Forward the client's headers to the backend and stamp the request ID

// FILL IN: Copy the client's headers; skip the hop-by-hop Connection header (Go keeps Host in r.Host rather than r.Header, so the Host check is only a safeguard)
for key, values := range r.Header {
    if key != "Host" && key != "Connection" {
        for _, value := range values {
            req.Header.Add(key, value)
        }
    }
}
req.Header.Set("X-Request-ID", requestID)
// Hint: Content-Type and Authorization must be forwarded so the backend can parse the body and authenticate

TODO 14: Send the request to the backend and record timing metrics

// FILL IN: Execute the request, measure duration, and label backend metrics
start := time.Now()
resp, err := g.httpClient.Do(req)
duration := time.Since(start).Seconds()
statusLabel := "error"
if resp != nil {
    statusLabel = fmt.Sprintf("%d", resp.StatusCode)
}
backendRequestsTotal.WithLabelValues(path, statusLabel).Inc()
backendRequestDuration.WithLabelValues(path).Observe(duration)
if err != nil {
    g.logger.Error("backend request failed", "request_id", requestID, "error", err)
    http.Error(w, "Backend unavailable", http.StatusBadGateway)
    return
}
defer resp.Body.Close()
// Hint: Record metrics even on error so dashboards show failure rate, not just silence

TODO 15: Copy all response headers from the backend to the client

// FILL IN: Forward every header the backend sets so clients see Content-Type, Cache-Control, etc.
for key, values := range resp.Header {
    for _, value := range values {
        w.Header().Add(key, value)
    }
}
// Hint: Headers must be written before WriteHeader — Go's ResponseWriter will ignore headers set afterward

TODO 16: Write the backend's status code and stream its body to the client

// FILL IN: Forward status code first, then stream the response body
w.WriteHeader(resp.StatusCode)
_, err = io.Copy(w, resp.Body)
if err != nil {
    g.logger.Error("failed to write response", "request_id", requestID, "error", err)
}
// Hint: io.Copy streams the body without loading it fully into memory — important for large batch responses
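TODOs 11-16 build the proxy by hand so that each step is visible. For comparison, Go's standard library ships net/http/httputil.ReverseProxy, which performs the same forwarding, header handling, and streaming. The sketch below is an alternative, not what the exercise asks for; newProxy and roundTrip are illustrative names, and the demo backend is an in-process test server rather than the sentiment API.

```go
package main

import (
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newProxy returns a handler that forwards every request to backend and
// answers 502 when the backend cannot be reached.
func newProxy(backend string) (http.Handler, error) {
	target, err := url.Parse(backend)
	if err != nil {
		return nil, err
	}
	proxy := httputil.NewSingleHostReverseProxy(target)
	proxy.ErrorHandler = func(w http.ResponseWriter, r *http.Request, err error) {
		http.Error(w, "Backend unavailable", http.StatusBadGateway)
	}
	return proxy, nil
}

// roundTrip starts a throwaway backend, proxies one POST /predict to it,
// and returns the status code and body the client would see.
func roundTrip() (int, string) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello from "+r.URL.Path)
	}))
	defer backend.Close()

	proxy, err := newProxy(backend.URL)
	if err != nil {
		return 0, err.Error()
	}
	rec := httptest.NewRecorder()
	proxy.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/predict", nil))
	return rec.Code, rec.Body.String()
}
```

The hand-written version in this module remains useful for learning and for places where you want explicit control over logging and backend metrics at each step.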

Server Setup (TODOs 17-20)

TODO 17: Create a structured JSON logger with the configured log level

// FILL IN: Map the LOG_LEVEL string to an slog.Level and build a JSON handler
var logLevel slog.Level
switch config.LogLevel {
case "debug":  logLevel = slog.LevelDebug
case "info":   logLevel = slog.LevelInfo
case "warn":   logLevel = slog.LevelWarn
case "error":  logLevel = slog.LevelError
default:       logLevel = slog.LevelInfo
}
logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: logLevel}))
slog.SetDefault(logger)
// Hint: JSON output lets log aggregators (Loki, Datadog) parse fields like request_id and duration_ms automatically

TODO 18: Configure the HTTP server with production-safe timeouts

// FILL IN: Create http.Server with read, write, and idle timeouts
server := &http.Server{
    Addr:         ":" + config.Port,
    Handler:      gateway.createHandler(),
    ReadTimeout:  15 * time.Second,
    WriteTimeout: 15 * time.Second,
    IdleTimeout:  60 * time.Second,
}
// Hint: Without timeouts, slow clients can hold connections open indefinitely, and each open connection ties up a goroutine, memory, and a file descriptor

TODO 19: Start the server in a goroutine so the main goroutine can handle shutdown signals

// FILL IN: Launch ListenAndServe in a goroutine; treat ErrServerClosed as a clean exit
go func() {
    logger.Info("server listening", "addr", server.Addr)
    if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
        logger.Error("server failed", "error", err)
        os.Exit(1)
    }
}()
// Hint: http.ErrServerClosed is returned by ListenAndServe after Shutdown() is called — it is not an error

TODO 20: Block until a termination signal arrives, then shut down gracefully

// FILL IN: Wait for SIGINT or SIGTERM, then give in-flight requests 30 s to finish
quit := make(chan os.Signal, 1)
signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
<-quit
logger.Info("shutting down server...")
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := server.Shutdown(ctx); err != nil {
    logger.Error("server forced to shutdown", "error", err)
    os.Exit(1)
}
logger.Info("server stopped gracefully")
// Hint: Kubernetes sends SIGTERM before killing a pod — this 30-second window lets active predictions complete

2. Build Docker Image

# Return to module-4 root
cd ..

# Build Docker image
docker build -t api-gateway:v1 .

# Load into kind
kind load docker-image api-gateway:v1 --name mlops-workshop

3. Deploy to Kubernetes

# Ensure ML service is running
kubectl get svc sentiment-api-service

# Deploy gateway
kubectl apply -f deployment.yaml

# Check deployment
kubectl get all -l app=api-gateway

# Port-forward to test
kubectl port-forward svc/api-gateway-service 8080:80

# Test (macOS/Linux/WSL)
curl http://localhost:8080/health
curl -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Kubernetes + Go!"}'

# Test (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health
$body = '{"text":"Kubernetes + Go!"}'
Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

4. Validate All Work

# macOS/Linux/WSL
curl -fsS http://localhost:8080/health   >/dev/null && echo "health OK"
curl -fsS http://localhost:8080/metrics  | grep -q http_requests_total && echo "metrics OK"
curl -fsS -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Kubernetes + Go!"}' | grep -q sentiment && echo "predict OK"

# Windows PowerShell
try { Invoke-RestMethod -Uri http://localhost:8080/health | Out-Null; Write-Host "health OK" } catch { Write-Host "health FAILED" }
$metrics = Invoke-RestMethod -Uri http://localhost:8080/metrics
if ($metrics -match "http_requests_total") { Write-Host "metrics OK" } else { Write-Host "metrics FAILED" }
$body = '{"text":"Kubernetes + Go!"}'
$result = Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body
if ($result.sentiment) { Write-Host "predict OK" } else { Write-Host "predict FAILED" }

If all three print OK, you've completed the exercise.

Part 3: Architecture & Concepts

Key Concepts Covered

Go Fundamentals

  • HTTP Server: Creating and configuring http.Server
  • Handlers: Request handling functions
  • Routing: ServeMux for URL pattern matching
  • Context: Request-scoped values

Reverse Proxy Pattern

  • Request Forwarding: Proxying requests to backend
  • Header Propagation: Copying headers between requests
  • Timeout Handling: Setting client and server timeouts
  • Error Handling: Graceful error responses

Production Features

  • Structured Logging: JSON logs with log/slog
  • Metrics: Prometheus instrumentation
  • Request Tracking: Correlation IDs (X-Request-ID)
  • Middleware: Composable request processing
  • Graceful Shutdown: Clean termination on SIGTERM

Observability

  • Logging: Structured JSON logs for parsing
  • Metrics: HTTP request counters and duration histograms
  • Tracing: Request ID propagation
  • Health Checks: Backend availability monitoring

Polyglot Architecture Benefits

Resource Efficiency

Without Gateway (Python only):

Python Service (10 replicas × 1Gi = 10Gi memory)
Cost: High

With Gateway (Go + Python):

Go Gateway (5 replicas × 64Mi = 320Mi)
    ↓
Python ML Service (3 replicas × 1Gi = 3Gi)

Total: 3.3Gi vs 10Gi = 67% reduction!
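The arithmetic behind the 67% figure can be checked with a few lines of Go. The replica counts and per-replica sizes are the ones from the example above (they are sizing assumptions, not measurements), and the function names are illustrative.

```go
package main

import "math"

const mibPerGiB = 1024.0

// totalGiB converts "replicas x per-replica MiB" into GiB.
func totalGiB(replicas int, perReplicaMiB float64) float64 {
	return float64(replicas) * perReplicaMiB / mibPerGiB
}

// reductionPercent returns how much smaller after is than before,
// rounded to a whole percent.
func reductionPercent(before, after float64) float64 {
	return math.Round((before - after) / before * 100)
}
```

With these, the Python-only setup is totalGiB(10, 1024) = 10 GiB, the split setup is totalGiB(5, 64) + totalGiB(3, 1024) = 0.3125 + 3 = 3.3125 GiB, and reductionPercent(10, 3.3125) gives 67. The saving comes from running fewer Python replicas, which assumes the gateway absorbs the connection handling that previously required them.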

Separation of Concerns

Layer      | Language | Responsibility | Scales Based On
Gateway    | Go       | Routing, auth  | Request count
ML Service | Python   | Inference      | CPU usage

Common Commands

# Local development
go mod download                # Download dependencies
go run gateway.go              # Run locally
go build -o gateway gateway.go # Build binary

# Docker
docker build -t api-gateway:v1 .
kind load docker-image api-gateway:v1 --name mlops-workshop

# Kubernetes
kubectl apply -f deployment.yaml
kubectl get pods -l app=api-gateway
kubectl logs -l app=api-gateway -f
kubectl port-forward svc/api-gateway-service 8080:80

Part 4: Troubleshooting

Issue 1: Build fails - Missing dependencies

Symptoms:

go build gateway.go
# Error: package prometheus is not in GOROOT

Root Cause: Dependencies not downloaded

Solutions:

Step 1: Download dependencies

# Ensure you're in the starter directory
cd modules/module-4/starter

# Download all dependencies
go mod download

# Verify modules
go mod verify

# Tidy up (add missing modules, remove unused ones)
go mod tidy

Step 2: Check Go version

go version
# Should be >= 1.21

# If too old, upgrade Go (macOS)
brew upgrade go
# Windows: download latest installer from https://go.dev/dl/

Step 3: Clear cache if corrupted

# Clean mod cache
go clean -modcache

# Re-download
go mod download

Issue 2: Build fails - Syntax errors

Symptoms:

gateway.go:45:2: syntax error: unexpected }

Root Cause: Incomplete TODO or syntax mistake

Solutions:

Check compilation:

# Build to see all errors
go build gateway.go

# List packages as they are compiled (-v does not add detail to error messages)
go build -v gateway.go

Use go fmt:

# Auto-format code (helps catch syntax issues)
go fmt gateway.go

# Check for common mistakes
go vet gateway.go

Issue 3: Cannot connect to backend service

Symptoms:

Error: dial tcp: lookup sentiment-api-service: no such host

Root Causes:

  1. Backend service not running
  2. Wrong service name or namespace
  3. DNS resolution issues

Solutions:

Check 1: Verify backend is running

# Check if service exists
kubectl get svc sentiment-api-service

# Check if pods are ready
kubectl get pods -l app=sentiment-api

# If not running, deploy it first (Module 3)
cd ../module-3
kubectl apply -f deployment.yaml

Check 2: Test backend connectivity

# Port-forward backend
kubectl port-forward svc/sentiment-api-service 3000:80

# Test in another terminal
curl http://localhost:3000/health

Check 3: Verify service name in gateway

# Check BACKEND_URL environment variable
kubectl get deployment api-gateway -o yaml | grep BACKEND_URL

# Should be: http://sentiment-api-service:80

Check 4: Check gateway logs

# View logs
kubectl logs -l app=api-gateway --tail=50

# Look for connection errors
kubectl logs -l app=api-gateway | grep -i "error\|fail"

Issue 4: Gateway responds with 502 Bad Gateway

Symptoms:

curl http://localhost:8080/predict
# HTTP 502 Bad Gateway

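Root Cause: The gateway could not get a response from the backend (connection refused, DNS failure, or timeout). Backend error statuses such as 500 are forwarded unchanged, so a 502 points at connectivity rather than an application error.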

Solutions:

Step 1: Check backend health

# Direct backend test
kubectl port-forward svc/sentiment-api-service 3000:80
curl http://localhost:3000/health

Step 2: Check gateway logs

# View detailed logs
kubectl logs -l app=api-gateway -f

# Look for proxy errors (the gateway logs them as "backend request failed")
kubectl logs -l app=api-gateway | grep "backend request failed"

Step 3: Verify backend response

# Exec into gateway pod
kubectl exec -it <gateway-pod> -- sh

# Test from inside pod
wget -qO- http://sentiment-api-service:80/health

Still stuck? Check the solution file


Part 5: Reference

Commands Cheat Sheet

Quick Start

# Navigate to module
cd modules/module-4/starter

# Download dependencies
go mod download

# Run locally (ensure backend is running)

# macOS/Linux/WSL
export BACKEND_URL=http://localhost:3000
go run gateway.go

# Windows PowerShell
$env:BACKEND_URL = "http://localhost:3000"
go run gateway.go

# In another terminal, test (macOS/Linux/WSL)
curl http://localhost:8080/health

# In another terminal, test (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health

Docker Commands

# Build Docker image
docker build -t api-gateway:v1 .

# Build with specific tag
docker build -t api-gateway:v1.0.0 .

# Run locally (macOS/Linux/WSL)
docker run -p 8080:8080 \
  -e BACKEND_URL=http://host.docker.internal:3000 \
  api-gateway:v1

# Run locally (Windows PowerShell)
docker run -p 8080:8080 `
  -e BACKEND_URL=http://host.docker.internal:3000 `
  api-gateway:v1

# Check image size
docker images api-gateway:v1

# Inspect layers
docker history api-gateway:v1

# Load into kind
kind load docker-image api-gateway:v1 --name mlops-workshop

Kubernetes Commands

# Apply deployment
kubectl apply -f deployment.yaml

# Get all resources
kubectl get all -l app=api-gateway

# Get pods with details
kubectl get pods -l app=api-gateway -o wide

# Describe deployment
kubectl describe deployment api-gateway

# View logs
kubectl logs -l app=api-gateway

# Follow logs
kubectl logs -l app=api-gateway -f

# View logs from specific container
kubectl logs <pod-name> -c api-gateway

# Exec into pod
kubectl exec -it <pod-name> -- sh

# Port-forward
kubectl port-forward svc/api-gateway-service 8080:80

# Scale manually
kubectl scale deployment api-gateway --replicas=3

# Restart deployment
kubectl rollout restart deployment api-gateway

# Check rollout status
kubectl rollout status deployment api-gateway

# View rollout history
kubectl rollout history deployment api-gateway

Debugging Commands

# macOS/Linux/WSL

# Check if gateway is running
ps aux | grep gateway

# Check port usage
lsof -i :8080

# Test locally
curl http://localhost:8080/health
curl http://localhost:8080/metrics

# Test with verbose output
curl -v http://localhost:8080/health

# Test request ID
curl -H "X-Request-ID: test-123" http://localhost:8080/health

# Windows PowerShell

# Check if gateway is running
Get-Process | Where-Object { $_.Name -like "*gateway*" }

# Check port usage
netstat -ano | findstr :8080

# Test locally
Invoke-RestMethod -Uri http://localhost:8080/health
Invoke-RestMethod -Uri http://localhost:8080/metrics

# Test with verbose output
Invoke-WebRequest -Uri http://localhost:8080/health -Verbose

# Test request ID
Invoke-RestMethod -Uri http://localhost:8080/health -Headers @{"X-Request-ID"="test-123"}

# Cross-platform (Go tools)

# Check Go environment
go env

# Check dependencies
go list -m all

# Why is package included?
go mod why github.com/prometheus/client_golang

# Dependency graph
go mod graph | grep prometheus

Metrics and Monitoring

# macOS/Linux/WSL

# View Prometheus metrics
curl http://localhost:8080/metrics

# Filter specific metrics
curl http://localhost:8080/metrics | grep http_requests_total

# Parse metrics
curl -s http://localhost:8080/metrics | \
  grep http_requests_total | \
  grep method

# Watch metrics change
watch -n 1 'curl -s http://localhost:8080/metrics | grep http_requests_total'

# Windows PowerShell

# View Prometheus metrics
Invoke-RestMethod -Uri http://localhost:8080/metrics

# Filter specific metrics
Invoke-RestMethod -Uri http://localhost:8080/metrics | Select-String "http_requests_total"

# Watch metrics change (refresh every second)
while ($true) {
    Invoke-RestMethod -Uri http://localhost:8080/metrics | Select-String "http_requests_total"
    Start-Sleep -Seconds 1
}

Performance Testing

# macOS/Linux/WSL — install hey and run load tests
go install github.com/rakyll/hey@latest

hey -n 1000 -c 10 http://localhost:8080/health

hey -n 1000 -c 10 -m POST \
    -H "Content-Type: application/json" \
    -d '{"text":"load test"}' \
    http://localhost:8080/predict

hey -z 30s -c 20 http://localhost:8080/health

# Save results
hey -n 1000 -c 10 http://localhost:8080/health > results.txt

# Using Apache Bench
ab -n 1000 -c 10 http://localhost:8080/health

# Windows PowerShell: install hey, then fix PATH
go install github.com/rakyll/hey@latest

# Add Go bin to PATH for the current session if hey isn't found
$env:PATH += ";$env:USERPROFILE\go\bin"

# Verify that hey is on PATH
Get-Command hey

# Basic load test
hey -n 1000 -c 10 http://localhost:8080/health

# POST load test
hey -n 1000 -c 10 -m POST -H "Content-Type: application/json" -d '{"text":"load test"}' http://localhost:8080/predict

# Windows PowerShell: alternative without hey (built-in load test)
$url = "http://localhost:8080/health"
$requests = 100
$start = Get-Date
1..$requests | ForEach-Object {
    Invoke-RestMethod -Uri $url | Out-Null
}
$elapsed = (Get-Date) - $start
Write-Host "$requests requests in $($elapsed.TotalSeconds)s ($([math]::Round($requests / $elapsed.TotalSeconds, 1)) req/s)"

Cleanup Commands

# macOS/Linux/WSL

# Stop running gateway
pkill gateway

# Remove built binaries
rm -f gateway gateway-linux

# Windows PowerShell

# Stop running gateway
Stop-Process -Name gateway -Force -ErrorAction SilentlyContinue

# Remove built binaries
Remove-Item -Force gateway.exe, gateway-linux -ErrorAction SilentlyContinue

# Cross-platform

# Delete Kubernetes resources
kubectl delete -f deployment.yaml

# Delete by label
kubectl delete all -l app=api-gateway

# Remove Docker image
docker rmi api-gateway:v1

# Clean Go cache
go clean -cache
go clean -modcache

Solution File

If you get stuck, a complete reference implementation is available in solution/:

  • gateway_solution.go - All TODOs completed with detailed comments

Note: Try to complete the exercise on your own first!

Next Steps

Once you've completed the exercise and tests pass:

Module 5: Kubeflow Pipelines

In Module 5, you'll build ML pipelines for training automation!

Key Takeaways

What We Learned

  • Go HTTP Server: Building production HTTP services
  • Reverse Proxy: Forwarding requests to backend services
  • Structured Logging: JSON logs for observability
  • Metrics: Prometheus instrumentation
  • Middleware: Composable request processing
  • Graceful Shutdown: Clean termination
  • Polyglot Architecture: Right tool for each job

Best Practices

  • Use Go for I/O-bound, routing, and gateway layers
  • Use Python for ML inference and data science
  • Always implement structured logging
  • Instrument code with Prometheus metrics
  • Propagate request IDs for tracing
  • Handle graceful shutdown
  • Set appropriate timeouts
  • Test backend connectivity in health checks

Real-World Production

This configuration is suitable for production ML APIs that need:

  • High Performance: Low latency, high throughput
  • Resource Efficiency: Minimal memory footprint
  • Observability: Structured logs and metrics
  • Reliability: Health checks and graceful shutdown
  • Scalability: Independent scaling of gateway and ML service

Having issues? Check the Troubleshooting section or review the solution file!


Navigation

Previous Home Next
Module 3: Kubernetes Deployment 🏠 Home Module 5: Kubeflow Pipelines & Model Serving
