Module 4: Golang API Gateway

What You'll Build

By the end of this module, you'll have:

✅ High-performance API Gateway written in Go (~17MB image vs ~900MB for a Python equivalent)
✅ Production-ready reverse proxy with request forwarding
✅ Complete observability stack (structured logging + Prometheus metrics)
✅ Request ID tracking for distributed tracing
✅ Graceful shutdown handling for zero-downtime deployments
✅ Backend health monitoring and automatic failover
✅ Polyglot microservices architecture (Go + Python)

Real-World Impact:

67% reduction in memory usage vs Python-only architecture
Sub-millisecond request routing overhead
5-10x faster startup (~1 second vs ~5-10 seconds) for rapid scaling
Professional-grade observability for production monitoring
Separation of concerns: Go for I/O, Python for ML

Learning Objectives

By the end of this module, you will:

✅ Build a production-ready API Gateway in Go
✅ Understand polyglot architecture (Go + Python)
✅ Implement structured logging and metrics
✅ Deploy Go services to Kubernetes
✅ Monitor multi-language microservices

Part 1: Setup & Prerequisites

Why Go for the Gateway?

Aspect	Go Gateway	Python Alternative
Image Size	~17MB	~900MB
Memory Usage	~64MB	~200MB
Startup Time	~1 second	~5-10 seconds
CPU Efficiency	High (compiled)	Lower (interpreted)
Best For	Routing, I/O	ML, Data Science

Key Insight: Use the right tool for each job!

Go: Lightweight gateway/API layer
Python: ML inference where libraries matter

Prerequisites

Completed Module 3 (Kubernetes deployment)
Go 1.21+ installed
Docker and kind running
kubectl configured

Install Go

# macOS
brew install go

# Verify
go version

# Windows — using winget
winget install GoLang.Go

# Or download the installer from https://go.dev/dl/
# Then verify
go version

Part 2: Hands-On Exercise

Quick Start

1. Complete the Exercise

API Gateway

Goal: Build a complete API Gateway in Go with all essential features.

cd modules/module-4/starter

# Open gateway.go in your editor

# Find and fill in 20 TODOs
# Look for: // YOUR CODE HERE

# Download dependencies
go mod download

# Build and test locally (ensure ML service is running)

# macOS/Linux/WSL
export BACKEND_URL=http://localhost:3000
go run gateway.go

# Windows PowerShell
$env:BACKEND_URL = "http://localhost:3000"
go run gateway.go

# In another terminal, test it (macOS/Linux/WSL)
curl http://localhost:8080/health
curl http://localhost:8080/metrics  # Prometheus metrics
curl -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Go is fast!"}'

# In another terminal, test it (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health
Invoke-RestMethod -Uri http://localhost:8080/metrics
$body = '{"text":"Go is fast!"}'
Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

# Quick smoke-test (macOS/Linux/WSL)
curl -fsS http://localhost:8080/health   >/dev/null && echo "health OK"
curl -fsS http://localhost:8080/metrics  >/dev/null && echo "metrics OK"
curl -fsS -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Go is fast!"}' && echo  # should print JSON response

# Quick smoke-test (Windows PowerShell)
try { Invoke-RestMethod -Uri http://localhost:8080/health | Out-Null; Write-Host "health OK" } catch { Write-Host "health FAILED" }
try { Invoke-RestMethod -Uri http://localhost:8080/metrics | Out-Null; Write-Host "metrics OK" } catch { Write-Host "metrics FAILED" }
$body = '{"text":"Go is fast!"}'; Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

Key TODOs (20 total):

Configuration (TODO 1)

TODO 1: Return a populated Config struct from loadConfig()

// FILL IN: Return Config struct with all fields populated from env vars
return &Config{
    Port:           getEnv("GATEWAY_PORT", "8080"),
    BackendURL:     getEnv("BACKEND_URL", "http://sentiment-api-service:80"),
    RequestTimeout: 10 * time.Second,
    LogLevel:       getEnv("LOG_LEVEL", "info"),
    Environment:    getEnv("ENVIRONMENT", "production"),
}
// Hint: Replace the `return nil`. Every field except RequestTimeout is read through getEnv, so the gateway is configurable via environment variables

Prometheus Metrics (TODOs 2-3)

TODO 2: Define the HTTP request counter

// FILL IN: promauto.NewCounterVec with method, endpoint, and status labels
httpRequestsTotal = promauto.NewCounterVec(
    prometheus.CounterOpts{
        Name: "gateway_http_requests_total",
        Help: "Total number of HTTP requests",
    },
    []string{"method", "endpoint", "status"},
)
// Hint: Replace the `= nil` — the status label lets you split success vs error counts in Grafana

TODO 3: Define the HTTP request duration histogram

// FILL IN: promauto.NewHistogramVec with method and endpoint labels
httpRequestDuration = promauto.NewHistogramVec(
    prometheus.HistogramOpts{
        Name:    "gateway_http_request_duration_seconds",
        Help:    "HTTP request duration in seconds",
        Buckets: prometheus.DefBuckets,
    },
    []string{"method", "endpoint"},
)
// Hint: Replace the `= nil` — DefBuckets gives you the standard latency buckets (5ms … 10s)

Middleware (TODOs 4-6)

TODO 4: Generate and propagate a request ID in requestIDMiddleware

// FILL IN: Reuse incoming X-Request-ID or generate a fresh UUID
requestID := r.Header.Get("X-Request-ID")
if requestID == "" {
    requestID = uuid.New().String()
}
ctx := context.WithValue(r.Context(), "request_id", requestID)
w.Header().Set("X-Request-ID", requestID)
next.ServeHTTP(w, r.WithContext(ctx))
// Hint: Storing the ID in the context lets every downstream handler retrieve it without passing it explicitly
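The `context` package documentation recommends that context keys not be plain strings, so that two packages cannot collide on the same key. A self-contained sketch of the same middleware with an unexported key type, assuming a stand-in `newID` built on `crypto/rand` in place of the `uuid` package so the example needs only the standard library. `ctxKey`, `newID`, `idFrom`, and `roundTrip` are illustrative names, not starter code:

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ctxKey is an unexported type, so no other package can collide with this key.
type ctxKey struct{}

// newID is a stand-in for uuid.New().String(): 16 random bytes, hex-encoded.
func newID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "unknown"
	}
	return hex.EncodeToString(b)
}

// requestID reuses the caller's X-Request-ID or generates one, stores it
// in the request context, and echoes it on the response.
func requestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Request-ID")
		if id == "" {
			id = newID()
		}
		w.Header().Set("X-Request-ID", id)
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// idFrom reads the ID back; the comma-ok assertion avoids a panic when
// the middleware was not installed.
func idFrom(ctx context.Context) string {
	id, _ := ctx.Value(ctxKey{}).(string)
	return id
}

// roundTrip sends one request through the middleware and reports the ID
// echoed in the response header and the ID the handler saw in its context.
func roundTrip(incoming string) (echoed, seen string) {
	h := requestID(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen = idFrom(r.Context())
	}))
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	if incoming != "" {
		req.Header.Set("X-Request-ID", incoming)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get("X-Request-ID"), seen
}

func main() {
	echoed, seen := roundTrip("test-123")
	fmt.Println(echoed, seen) // both are the caller's ID
	echoed, seen = roundTrip("")
	fmt.Println(len(echoed), echoed == seen) // a generated ID, also visible to the handler
}
```

If you adopt a typed key in the exercise, every handler that reads the ID must use the same key, so change the lookup sites together.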

TODO 5: Call the next handler and log the completed request in loggingMiddleware

// FILL IN: Delegate to next, then emit a structured completion log
next.ServeHTTP(rw, r)
duration := time.Since(start)
logger.Info("request completed",
    "request_id", requestID,
    "method", r.Method,
    "path", r.URL.Path,
    "status", rw.statusCode,
    "duration_ms", duration.Milliseconds(),
    "bytes_written", rw.written)
// Hint: rw is the wrapped responseWriter that captures status code and bytes — use it, not w

TODO 6: Record Prometheus metrics after each request in metricsMiddleware

// FILL IN: Increment the counter and observe the duration using the captured status code
httpRequestsTotal.WithLabelValues(r.Method, r.URL.Path, fmt.Sprintf("%d", rw.statusCode)).Inc()
httpRequestDuration.WithLabelValues(r.Method, r.URL.Path).Observe(duration)
// Hint: duration is already computed as time.Since(start).Seconds() just above — reuse it

Handlers (TODOs 7-10)

TODO 7: Create the HTTP mux and register all routes in createHandler

// FILL IN: Wire up every endpoint and wrap the mux with all three middlewares
mux := http.NewServeMux()
mux.HandleFunc("/health", g.handleHealth)
mux.HandleFunc("/predict", g.handlePredict)
mux.HandleFunc("/batch_predict", g.handleBatchPredict)
mux.Handle("/metrics", promhttp.Handler())
mux.HandleFunc("/", g.handleRoot)
return chain(mux, requestIDMiddleware, loggingMiddleware(g.logger), metricsMiddleware)
// Hint: Replace `return nil` — chain applies middlewares in order: request ID → logging → metrics
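The `chain` helper comes with the starter. One way it could be implemented so that the first middleware listed runs first is sketched below; the `Middleware` type, `tag`, and `runChain` are hypothetical names used only to make the ordering visible:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Middleware wraps a handler with extra behavior.
type Middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed is the outermost one
// and therefore sees the request first.
func chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag builds a middleware that records its name on the way in and on the way out.
func tag(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name+" in")
			next.ServeHTTP(w, r)
			*trace = append(*trace, name+" out")
		})
	}
}

// runChain sends one request through three tagged middlewares and returns the call order.
func runChain() string {
	var trace []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "handler")
	})
	h := chain(final, tag("requestID", &trace), tag("logging", &trace), tag("metrics", &trace))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/health", nil))
	return strings.Join(trace, " > ")
}

func main() {
	fmt.Println(runChain())
}
```

The trace shows why order matters: the request ID middleware must be outermost so that the logging middleware can read the ID from the context.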

TODO 8: Reject non-GET methods in handleHealth

// FILL IN: Return 405 for any method that is not GET
if r.Method != http.MethodGet {
    http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
    return
}
// Hint: Health checks should only respond to GET; rejecting early avoids unnecessary backend probes

TODO 9: Probe the backend /health endpoint with a short timeout

// FILL IN: Create a 2-second context and fire a GET to the backend
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
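req, err := http.NewRequestWithContext(ctx, http.MethodGet, g.config.BackendURL+"/health", nil)
var resp *http.Response
if err == nil {
    resp, err = g.httpClient.Do(req) // skip the call if the request could not be built (e.g. malformed BACKEND_URL)
}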
// Hint: A dedicated short timeout prevents a slow backend from blocking the gateway's own health response

TODO 10: Include the backend's reachability in the health JSON response

// FILL IN: Populate health["backend"] based on whether err is nil
if err != nil {
    health["backend"] = "unreachable"
    health["backend_error"] = err.Error()
} else {
    resp.Body.Close()
    health["backend"] = "ok"
    health["backend_status"] = resp.StatusCode
}
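// Hint: Kubernetes HTTP probes judge only the status code (200-399 passes) and ignore the JSON body, so this field mainly helps operators and dashboards; also return a failing status such as 503 if a readiness probe should route traffic away from the pod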

Proxy Logic (TODOs 11-16)

TODO 11: Read the incoming request body in proxyRequest

// FILL IN: Drain r.Body so you can forward the raw bytes to the backend
body, err := io.ReadAll(r.Body)
if err != nil {
    g.logger.Error("failed to read request body", "request_id", requestID, "error", err)
    http.Error(w, "Failed to read request", http.StatusBadRequest)
    return
}
defer r.Body.Close()
// Hint: Reading the body fully lets you log its size and forward it as a fresh reader to the backend

TODO 12: Build the outbound request to the backend service

// FILL IN: Construct a new request targeting the backend URL + path
backendURL := g.config.BackendURL + path
req, err := http.NewRequestWithContext(r.Context(), r.Method, backendURL, bytes.NewReader(body))
if err != nil {
    g.logger.Error("failed to create backend request", "request_id", requestID, "error", err)
    http.Error(w, "Internal server error", http.StatusInternalServerError)
    return
}
// Hint: Passing r.Context() propagates the deadline and cancellation from the original client request

TODO 13: Forward the client's headers to the backend and stamp the request ID

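// FILL IN: Copy the client's headers but skip Connection, which is hop-by-hop; Host is not hop-by-hop, and Go keeps it in r.Host rather than r.Header, so that check is only a safeguard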
for key, values := range r.Header {
    if key != "Host" && key != "Connection" {
        for _, value := range values {
            req.Header.Add(key, value)
        }
    }
}
req.Header.Set("X-Request-ID", requestID)
// Hint: Content-Type and Authorization must be forwarded so the backend can parse the body and authenticate

TODO 14: Send the request to the backend and record timing metrics

// FILL IN: Execute the request, measure duration, and label backend metrics
start := time.Now()
resp, err := g.httpClient.Do(req)
duration := time.Since(start).Seconds()
statusLabel := "error"
if resp != nil {
    statusLabel = fmt.Sprintf("%d", resp.StatusCode)
}
backendRequestsTotal.WithLabelValues(path, statusLabel).Inc()
backendRequestDuration.WithLabelValues(path).Observe(duration)
if err != nil {
    g.logger.Error("backend request failed", "request_id", requestID, "error", err)
    http.Error(w, "Backend unavailable", http.StatusBadGateway)
    return
}
defer resp.Body.Close()
// Hint: Record metrics even on error so dashboards show failure rate, not just silence

TODO 15: Copy all response headers from the backend to the client

// FILL IN: Forward every header the backend sets so clients see Content-Type, Cache-Control, etc.
for key, values := range resp.Header {
    for _, value := range values {
        w.Header().Add(key, value)
    }
}
// Hint: Headers must be written before WriteHeader — Go's ResponseWriter will ignore headers set afterward

TODO 16: Write the backend's status code and stream its body to the client

// FILL IN: Forward status code first, then stream the response body
w.WriteHeader(resp.StatusCode)
_, err = io.Copy(w, resp.Body)
if err != nil {
    g.logger.Error("failed to write response", "request_id", requestID, "error", err)
}
// Hint: io.Copy streams the body without loading it fully into memory — important for large batch responses

Server Setup (TODOs 17-20)

TODO 17: Create a structured JSON logger with the configured log level

// FILL IN: Map the LOG_LEVEL string to an slog.Level and build a JSON handler
var logLevel slog.Level
switch config.LogLevel {
case "debug":  logLevel = slog.LevelDebug
case "info":   logLevel = slog.LevelInfo
case "warn":   logLevel = slog.LevelWarn
case "error":  logLevel = slog.LevelError
default:       logLevel = slog.LevelInfo
}
logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: logLevel}))
slog.SetDefault(logger)
// Hint: JSON output lets log aggregators (Loki, Datadog) parse fields like request_id and duration_ms automatically

TODO 18: Configure the HTTP server with production-safe timeouts

// FILL IN: Create http.Server with read, write, and idle timeouts
server := &http.Server{
    Addr:         ":" + config.Port,
    Handler:      gateway.createHandler(),
    ReadTimeout:  15 * time.Second,
    WriteTimeout: 15 * time.Second,
    IdleTimeout:  60 * time.Second,
}
// Hint: Without timeouts, slow or stalled clients can hold connections open indefinitely, tying up a goroutine and a file descriptor each

TODO 19: Start the server in a goroutine so the main goroutine can handle shutdown signals

// FILL IN: Launch ListenAndServe in a goroutine; treat ErrServerClosed as a clean exit
go func() {
    logger.Info("server listening", "addr", server.Addr)
    if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
        logger.Error("server failed", "error", err)
        os.Exit(1)
    }
}()
// Hint: http.ErrServerClosed is returned by ListenAndServe after Shutdown() is called — it is not an error

TODO 20: Block until a termination signal arrives, then shut down gracefully

// FILL IN: Wait for SIGINT or SIGTERM, then give in-flight requests 30 s to finish
quit := make(chan os.Signal, 1)
signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
<-quit
logger.Info("shutting down server...")
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := server.Shutdown(ctx); err != nil {
    logger.Error("server forced to shutdown", "error", err)
    os.Exit(1)
}
logger.Info("server stopped gracefully")
// Hint: Kubernetes sends SIGTERM before killing a pod — this 30-second window lets active predictions complete

2. Build Docker Image

# Return to module-4 root
cd ..

# Build Docker image
docker build -t api-gateway:v1 .

# Load into kind
kind load docker-image api-gateway:v1 --name mlops-workshop

3. Deploy to Kubernetes

# Ensure ML service is running
kubectl get svc sentiment-api-service

# Deploy gateway
kubectl apply -f deployment.yaml

# Check deployment
kubectl get all -l app=api-gateway

# Port-forward to test
kubectl port-forward svc/api-gateway-service 8080:80

# Test (macOS/Linux/WSL)
curl http://localhost:8080/health
curl -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Kubernetes + Go!"}'

# Test (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health
$body = '{"text":"Kubernetes + Go!"}'
Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

4. Validate All Work

# macOS/Linux/WSL
curl -fsS http://localhost:8080/health   >/dev/null && echo "health OK"
curl -fsS http://localhost:8080/metrics  | grep -q http_requests_total && echo "metrics OK"
curl -fsS -X POST http://localhost:8080/predict \
     -H "Content-Type: application/json" \
     -d '{"text":"Kubernetes + Go!"}' | grep -q sentiment && echo "predict OK"

# Windows PowerShell
try { Invoke-RestMethod -Uri http://localhost:8080/health | Out-Null; Write-Host "health OK" } catch { Write-Host "health FAILED" }
$metrics = Invoke-RestMethod -Uri http://localhost:8080/metrics
if ($metrics -match "http_requests_total") { Write-Host "metrics OK" } else { Write-Host "metrics FAILED" }
$body = '{"text":"Kubernetes + Go!"}'
$result = Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body
if ($result.sentiment) { Write-Host "predict OK" } else { Write-Host "predict FAILED" }

If all three print OK, you've completed the exercise.

Part 3: Architecture & Concepts

Key Concepts Covered

Go Fundamentals

HTTP Server: Creating and configuring http.Server
Handlers: Request handling functions
Routing: ServeMux for URL pattern matching
Context: Request-scoped values

Reverse Proxy Pattern

Request Forwarding: Proxying requests to backend
Header Propagation: Copying headers between requests
Timeout Handling: Setting client and server timeouts
Error Handling: Graceful error responses

Production Features

Structured Logging: JSON logs with log/slog
Metrics: Prometheus instrumentation
Request Tracking: Correlation IDs (X-Request-ID)
Middleware: Composable request processing
Graceful Shutdown: Clean termination on SIGTERM

Observability

Logging: Structured JSON logs for parsing
Metrics: HTTP request counters and duration histograms
Tracing: Request ID propagation
Health Checks: Backend availability monitoring

Polyglot Architecture Benefits

Resource Efficiency

Without Gateway (Python only):

Python Service (10 replicas × 1Gi = 10Gi memory)
Cost: High

With Gateway (Go + Python):

Go Gateway (5 replicas × 64Mi = 320Mi)
    ↓
Python ML Service (3 replicas × 1Gi = 3Gi)

Total: 3.3Gi vs 10Gi = 67% reduction!
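The 67% figure follows directly from the replica counts and per-replica memory listed above. A short sketch of the arithmetic, using those same numbers; the function names are illustrative:

```go
package main

import "fmt"

// totalMi returns the memory requested by a deployment in MiB.
func totalMi(replicas, perReplicaMi int) int {
	return replicas * perReplicaMi
}

// reductionPercent returns how much smaller after is than before, in percent.
func reductionPercent(before, after int) float64 {
	return 100 * float64(before-after) / float64(before)
}

func main() {
	pythonOnly := totalMi(10, 1024)               // 10 replicas x 1Gi
	polyglot := totalMi(5, 64) + totalMi(3, 1024) // Go gateway + Python ML service
	fmt.Printf("%d Mi vs %d Mi: %.0f%% reduction\n",
		polyglot, pythonOnly, reductionPercent(pythonOnly, polyglot))
}
```

The saving comes almost entirely from running 3 Python replicas instead of 10; the gateway's 320Mi is a small part of the total.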

Separation of Concerns

Layer	Language	Responsibility	Scales Based On
Gateway	Go	Routing, auth	Request count
ML Service	Python	Inference	CPU usage

Common Commands

# Local development
go mod download                # Download dependencies
go run gateway.go              # Run locally
go build -o gateway gateway.go # Build binary

# Docker
docker build -t api-gateway:v1 .
kind load docker-image api-gateway:v1 --name mlops-workshop

# Kubernetes
kubectl apply -f deployment.yaml
kubectl get pods -l app=api-gateway
kubectl logs -l app=api-gateway -f
kubectl port-forward svc/api-gateway-service 8080:80

Part 4: Troubleshooting

Troubleshooting

Issue 1: Build fails - Missing dependencies

Symptoms:

go build gateway.go
# Error: package prometheus is not in GOROOT

Root Cause: Dependencies not downloaded

Solutions:

Step 1: Download dependencies

# Ensure you're in the starter directory
cd modules/module-4/starter

# Download all dependencies
go mod download

# Verify modules
go mod verify

# Tidy up (remove unused)
go mod tidy

Step 2: Check Go version

go version
# Should be >= 1.21

# If too old, upgrade Go (macOS)
brew upgrade go
# Windows: download latest installer from https://go.dev/dl/

Step 3: Clear cache if corrupted

# Clean mod cache
go clean -modcache

# Re-download
go mod download

Issue 2: Build fails - Syntax errors

Symptoms:

gateway.go:45:2: syntax error: unexpected }

Root Cause: Incomplete TODO or syntax mistake

Solutions:

Check compilation:

# Build to see all errors
go build gateway.go

# Report every error (the compiler stops after 10 by default)
go build -gcflags=-e gateway.go

Use go fmt:

# Auto-format code (helps catch syntax issues)
go fmt gateway.go

# Check for common mistakes
go vet gateway.go

Issue 3: Cannot connect to backend service

Symptoms:

Error: dial tcp: lookup sentiment-api-service: no such host

Root Causes:

Backend service not running
Wrong service name or namespace
DNS resolution issues

Solutions:

Check 1: Verify backend is running

# Check if service exists
kubectl get svc sentiment-api-service

# Check if pods are ready
kubectl get pods -l app=sentiment-api

# If not running, deploy it first (Module 3)
cd ../module-3
kubectl apply -f deployment.yaml

Check 2: Test backend connectivity

# Port-forward backend
kubectl port-forward svc/sentiment-api-service 3000:80

# Test in another terminal
curl http://localhost:3000/health

Check 3: Verify service name in gateway

# Check BACKEND_URL environment variable
kubectl get deployment api-gateway -o yaml | grep -A1 BACKEND_URL

# Should be: http://sentiment-api-service:80

Check 4: Check gateway logs

# View logs
kubectl logs -l app=api-gateway --tail=50

# Look for connection errors
kubectl logs -l app=api-gateway | grep -iE "error|fail"

Issue 4: Gateway responds with 502 Bad Gateway

Symptoms:

curl http://localhost:8080/predict
# HTTP 502 Bad Gateway

Root Cause: Backend service is down or returning errors

Solutions:

Step 1: Check backend health

# Direct backend test (port-forward blocks, so run curl in another terminal)
kubectl port-forward svc/sentiment-api-service 3000:80
curl http://localhost:3000/health

Step 2: Check gateway logs

# View detailed logs
kubectl logs -l app=api-gateway -f

# Look for proxy errors (the gateway logs them as "backend request failed")
kubectl logs -l app=api-gateway | grep -i "backend request failed"

Step 3: Verify backend response

# Exec into gateway pod
kubectl exec -it <gateway-pod> -- sh

# Test from inside pod
wget -qO- http://sentiment-api-service:80/health

Still stuck? Check the solution file

Part 5: Reference

Commands Cheat Sheet

Quick Start

# Navigate to module
cd modules/module-4/starter

# Download dependencies
go mod download

# Run locally (ensure backend is running)

# macOS/Linux/WSL
export BACKEND_URL=http://localhost:3000
go run gateway.go

# Windows PowerShell
$env:BACKEND_URL = "http://localhost:3000"
go run gateway.go

# In another terminal, test (macOS/Linux/WSL)
curl http://localhost:8080/health

# In another terminal, test (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health

Docker Commands

# Build Docker image
docker build -t api-gateway:v1 .

# Build with specific tag
docker build -t api-gateway:v1.0.0 .

# Run locally (macOS/Linux/WSL)
# On Linux without Docker Desktop, also pass: --add-host=host.docker.internal:host-gateway
docker run -p 8080:8080 \
  -e BACKEND_URL=http://host.docker.internal:3000 \
  api-gateway:v1

# Run locally (Windows PowerShell)
docker run -p 8080:8080 `
  -e BACKEND_URL=http://host.docker.internal:3000 `
  api-gateway:v1

# Check image size
docker images api-gateway:v1

# Inspect layers
docker history api-gateway:v1

# Load into kind
kind load docker-image api-gateway:v1 --name mlops-workshop

Kubernetes Commands

# Apply deployment
kubectl apply -f deployment.yaml

# Get all resources
kubectl get all -l app=api-gateway

# Get pods with details
kubectl get pods -l app=api-gateway -o wide

# Describe deployment
kubectl describe deployment api-gateway

# View logs
kubectl logs -l app=api-gateway

# Follow logs
kubectl logs -l app=api-gateway -f

# View logs from specific container
kubectl logs <pod-name> -c api-gateway

# Exec into pod
kubectl exec -it <pod-name> -- sh

# Port-forward
kubectl port-forward svc/api-gateway-service 8080:80

# Scale manually
kubectl scale deployment api-gateway --replicas=3

# Restart deployment
kubectl rollout restart deployment api-gateway

# Check rollout status
kubectl rollout status deployment api-gateway

# View rollout history
kubectl rollout history deployment api-gateway

Debugging Commands

# macOS/Linux/WSL

# Check if gateway is running
ps aux | grep gateway

# Check port usage
lsof -i :8080

# Test locally
curl http://localhost:8080/health
curl http://localhost:8080/metrics

# Test with verbose output
curl -v http://localhost:8080/health

# Test request ID
curl -H "X-Request-ID: test-123" http://localhost:8080/health

# Windows PowerShell

# Check if gateway is running
Get-Process | Where-Object { $_.Name -like "*gateway*" }

# Check port usage
netstat -ano | findstr :8080

# Test locally
Invoke-RestMethod -Uri http://localhost:8080/health
Invoke-RestMethod -Uri http://localhost:8080/metrics

# Test with verbose output
Invoke-WebRequest -Uri http://localhost:8080/health -Verbose

# Test request ID
Invoke-RestMethod -Uri http://localhost:8080/health -Headers @{"X-Request-ID"="test-123"}

# Cross-platform (Go tools)

# Check Go environment
go env

# Check dependencies
go list -m all

# Why is package included?
go mod why github.com/prometheus/client_golang

# Dependency graph
go mod graph | grep prometheus

Metrics and Monitoring

# macOS/Linux/WSL

# View Prometheus metrics
curl http://localhost:8080/metrics

# Filter specific metrics
curl http://localhost:8080/metrics | grep http_requests_total

# Parse metrics
curl -s http://localhost:8080/metrics | \
  grep http_requests_total | \
  grep method

# Watch metrics change
watch -n 1 'curl -s http://localhost:8080/metrics | grep http_requests_total'

# Windows PowerShell

# View Prometheus metrics
Invoke-RestMethod -Uri http://localhost:8080/metrics

# Filter specific metrics
Invoke-RestMethod -Uri http://localhost:8080/metrics | Select-String "http_requests_total"

# Watch metrics change (refresh every second)
while ($true) {
    Invoke-RestMethod -Uri http://localhost:8080/metrics | Select-String "http_requests_total"
    Start-Sleep -Seconds 1
}

Performance Testing

# macOS/Linux/WSL — install hey and run load tests
go install github.com/rakyll/hey@latest

hey -n 1000 -c 10 http://localhost:8080/health

hey -n 1000 -c 10 -m POST \
    -H "Content-Type: application/json" \
    -d '{"text":"load test"}' \
    http://localhost:8080/predict

hey -z 30s -c 20 http://localhost:8080/health

# Save results
hey -n 1000 -c 10 http://localhost:8080/health > results.txt

# Using Apache Bench
ab -n 1000 -c 10 http://localhost:8080/health

# Windows PowerShell — install hey then fix PATH
go install github.com/rakyll/hey@latest

# Add Go bin to PATH for the current session if hey isn't found
$env:PATH += ";$env:USERPROFILE\go\bin"

# Verify hey is on PATH
Get-Command hey

# Basic load test
hey -n 1000 -c 10 http://localhost:8080/health

# POST load test
hey -n 1000 -c 10 -m POST -H "Content-Type: application/json" -d '{"text":"load test"}' http://localhost:8080/predict

# Windows PowerShell — alternative without hey (built-in load test)
$url = "http://localhost:8080/health"
$requests = 100
$start = Get-Date
1..$requests | ForEach-Object {
    Invoke-RestMethod -Uri $url | Out-Null
}
$elapsed = (Get-Date) - $start
Write-Host "$requests requests in $($elapsed.TotalSeconds)s ($([math]::Round($requests / $elapsed.TotalSeconds, 1)) req/s)"

Cleanup Commands

# macOS/Linux/WSL

# Stop running gateway
pkill gateway

# Remove built binaries (-f ignores files that don't exist)
rm -f gateway gateway-linux

# Windows PowerShell

# Stop running gateway
Stop-Process -Name gateway -Force -ErrorAction SilentlyContinue

# Remove built binaries
Remove-Item -Force gateway.exe, gateway-linux -ErrorAction SilentlyContinue

# Cross-platform

# Delete Kubernetes resources
kubectl delete -f deployment.yaml

# Delete by label
kubectl delete all -l app=api-gateway

# Remove Docker image
docker rmi api-gateway:v1

# Clean Go cache
go clean -cache
go clean -modcache

Solution File

If you get stuck, a complete reference implementation is available in solution/:

gateway_solution.go - All TODOs completed with detailed comments

Note: Try to complete the exercise on your own first!

Next Steps

Once you've completed the exercise and tests pass:

→ Module 5: Kubeflow Pipelines

In Module 5, you'll build ML pipelines for training automation!

Key Takeaways

What We Learned

✅ Go HTTP Server: Building production HTTP services
✅ Reverse Proxy: Forwarding requests to backend services
✅ Structured Logging: JSON logs for observability
✅ Metrics: Prometheus instrumentation
✅ Middleware: Composable request processing
✅ Graceful Shutdown: Clean termination
✅ Polyglot Architecture: Right tool for each job

Best Practices

Use Go for I/O-bound, routing, and gateway layers
Use Python for ML inference and data science
Always implement structured logging
Instrument code with Prometheus metrics
Propagate request IDs for tracing
Handle graceful shutdown
Set appropriate timeouts
Test backend connectivity in health checks

Real-World Production

This configuration is suitable for production ML APIs that need:

High Performance: Low latency, high throughput
Resource Efficiency: Minimal memory footprint
Observability: Structured logs and metrics
Reliability: Health checks and graceful shutdown
Scalability: Independent scaling of gateway and ML service

Having issues? Check the Troubleshooting section or review the solution file!

