Module 4
By the end of this module, you'll have:
- ✅ High-performance API Gateway written in Go (~17MB image vs ~900MB for a Python image)
- ✅ Production-ready reverse proxy with request forwarding
- ✅ Complete observability stack (structured logging + Prometheus metrics)
- ✅ Request ID tracking for distributed tracing
- ✅ Graceful shutdown handling for zero-downtime deployments
- ✅ Backend health monitoring and automatic failover
- ✅ Polyglot microservices architecture (Go + Python)
Real-World Impact:
- About 67% lower memory requests than a Python-only architecture in the example sizing (3.3Gi vs 10Gi)
- Sub-millisecond request routing overhead
- 5-10x faster startup times for rapid scaling (~1 second vs ~5-10 seconds)
- Professional-grade observability for production monitoring
- Separation of concerns: Go for I/O, Python for ML
By the end of this module, you will:
- ✅ Build a production-ready API Gateway in Go
- ✅ Understand polyglot architecture (Go + Python)
- ✅ Implement structured logging and metrics
- ✅ Deploy Go services to Kubernetes
- ✅ Monitor multi-language microservices
| Aspect | Go Gateway | Python Alternative |
|---|---|---|
| Image Size | ~17MB | ~900MB |
| Memory Usage | ~64MB | ~200MB |
| Startup Time | ~1 second | ~5-10 seconds |
| CPU Efficiency | High (compiled) | Lower (interpreted) |
| Best For | Routing, I/O | ML, Data Science |
Key Insight: Use the right tool for each job!
- Go: Lightweight gateway/API layer
- Python: ML inference where libraries matter
- Completed Module 3 (Kubernetes deployment)
- Go 1.21+ installed
- Docker and kind running
- kubectl configured
# macOS
brew install go
# Verify
go version
# Windows (using winget)
winget install GoLang.Go
# Or download the installer from https://go.dev/dl/
# Then verify
go version

Goal: Build a complete API Gateway in Go with all essential features.
cd modules/module-4/starter
# Open the file
gateway.go
# Find and fill in 20 TODOs
# Look for: // YOUR CODE HERE
# Download dependencies
go mod download
# Build and test locally (ensure ML service is running)
# macOS/Linux/WSL
export BACKEND_URL=http://localhost:3000
go run gateway.go
# Windows PowerShell
$env:BACKEND_URL = "http://localhost:3000"
go run gateway.go
# In another terminal, test it (macOS/Linux/WSL)
curl http://localhost:8080/health
curl http://localhost:8080/metrics # Prometheus metrics
curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"text":"Go is fast!"}'
# In another terminal, test it (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health
Invoke-RestMethod -Uri http://localhost:8080/metrics
$body = '{"text":"Go is fast!"}'
Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body
# Quick smoke-test (macOS/Linux/WSL)
curl -fsS http://localhost:8080/health >/dev/null && echo "health OK"
curl -fsS http://localhost:8080/metrics >/dev/null && echo "metrics OK"
curl -fsS -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"text":"Go is fast!"}' && echo # should print JSON response
# Quick smoke-test (Windows PowerShell)
try { Invoke-RestMethod -Uri http://localhost:8080/health | Out-Null; Write-Host "health OK" } catch { Write-Host "health FAILED" }
try { Invoke-RestMethod -Uri http://localhost:8080/metrics | Out-Null; Write-Host "metrics OK" } catch { Write-Host "metrics FAILED" }
$body = '{"text":"Go is fast!"}'; Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

Key TODOs (20 total):
TODO 1: Return a populated Config struct from loadConfig()
// FILL IN: Return Config struct with all fields populated from env vars
return &Config{
Port: getEnv("GATEWAY_PORT", "8080"),
BackendURL: getEnv("BACKEND_URL", "http://sentiment-api-service:80"),
RequestTimeout: 10 * time.Second,
LogLevel: getEnv("LOG_LEVEL", "info"),
Environment: getEnv("ENVIRONMENT", "production"),
}
// Hint: Replace the `return nil`. Every field except RequestTimeout is read through getEnv, so the gateway is configurable via environment variables

TODO 2: Define the HTTP request counter
// FILL IN: promauto.NewCounterVec with method, endpoint, and status labels
httpRequestsTotal = promauto.NewCounterVec(
prometheus.CounterOpts{
Name: "gateway_http_requests_total",
Help: "Total number of HTTP requests",
},
[]string{"method", "endpoint", "status"},
)
// Hint: Replace the `= nil`. The status label lets you split success vs error counts in Grafana

TODO 3: Define the HTTP request duration histogram
// FILL IN: promauto.NewHistogramVec with method and endpoint labels
httpRequestDuration = promauto.NewHistogramVec(
prometheus.HistogramOpts{
Name: "gateway_http_request_duration_seconds",
Help: "HTTP request duration in seconds",
Buckets: prometheus.DefBuckets,
},
[]string{"method", "endpoint"},
)
// Hint: Replace the `= nil`. DefBuckets gives you the standard latency buckets (5ms … 10s)

TODO 4: Generate and propagate a request ID in requestIDMiddleware
// FILL IN: Reuse incoming X-Request-ID or generate a fresh UUID
requestID := r.Header.Get("X-Request-ID")
if requestID == "" {
requestID = uuid.New().String()
}
ctx := context.WithValue(r.Context(), "request_id", requestID)
w.Header().Set("X-Request-ID", requestID)
next.ServeHTTP(w, r.WithContext(ctx))
// Hint: Storing the ID in the context lets every downstream handler retrieve it without passing it explicitly

TODO 5: Call the next handler and log the completed request in loggingMiddleware
// FILL IN: Delegate to next, then emit a structured completion log
next.ServeHTTP(rw, r)
duration := time.Since(start)
logger.Info("request completed",
"request_id", requestID,
"method", r.Method,
"path", r.URL.Path,
"status", rw.statusCode,
"duration_ms", duration.Milliseconds(),
"bytes_written", rw.written)
// Hint: rw is the wrapped responseWriter that captures status code and bytes. Use it, not w

TODO 6: Record Prometheus metrics after each request in metricsMiddleware
// FILL IN: Increment the counter and observe the duration using the captured status code
httpRequestsTotal.WithLabelValues(r.Method, r.URL.Path, fmt.Sprintf("%d", rw.statusCode)).Inc()
httpRequestDuration.WithLabelValues(r.Method, r.URL.Path).Observe(duration)
// Hint: duration is already computed as time.Since(start).Seconds() just above. Reuse it

TODO 7: Create the HTTP mux and register all routes in createHandler
// FILL IN: Wire up every endpoint and wrap the mux with all three middlewares
mux := http.NewServeMux()
mux.HandleFunc("/health", g.handleHealth)
mux.HandleFunc("/predict", g.handlePredict)
mux.HandleFunc("/batch_predict", g.handleBatchPredict)
mux.Handle("/metrics", promhttp.Handler())
mux.HandleFunc("/", g.handleRoot)
return chain(mux, requestIDMiddleware, loggingMiddleware(g.logger), metricsMiddleware)
// Hint: Replace `return nil`. chain applies middlewares in order: request ID → logging → metrics

TODO 8: Reject non-GET methods in handleHealth
// FILL IN: Return 405 for any method that is not GET
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
// Hint: Health checks should only respond to GET; rejecting early avoids unnecessary backend probes

TODO 9: Probe the backend /health endpoint with a short timeout
// FILL IN: Create a 2-second context and fire a GET to the backend
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
req, _ := http.NewRequestWithContext(ctx, http.MethodGet, g.config.BackendURL+"/health", nil)
resp, err := g.httpClient.Do(req)
// Hint: A dedicated short timeout prevents a slow backend from blocking the gateway's own health response

TODO 10: Include the backend's reachability in the health JSON response
// FILL IN: Populate health["backend"] based on whether err is nil
if err != nil {
health["backend"] = "unreachable"
health["backend_error"] = err.Error()
} else {
resp.Body.Close()
health["backend"] = "ok"
health["backend_status"] = resp.StatusCode
}
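// Hint: Kubernetes HTTP probes look only at the status code (200-399 counts as success), not the JSON body. These fields help humans and dashboards; for a readiness probe to route traffic away when the backend is down, the handler must also return a failing status such as 503

TODO 11: Read the incoming request body in proxyRequest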
// FILL IN: Drain r.Body so you can forward the raw bytes to the backend
body, err := io.ReadAll(r.Body)
if err != nil {
g.logger.Error("failed to read request body", "request_id", requestID, "error", err)
http.Error(w, "Failed to read request", http.StatusBadRequest)
return
}
defer r.Body.Close()
// Hint: Reading the body fully lets you log its size and forward it as a fresh reader to the backend

TODO 12: Build the outbound request to the backend service
// FILL IN: Construct a new request targeting the backend URL + path
backendURL := g.config.BackendURL + path
req, err := http.NewRequestWithContext(r.Context(), r.Method, backendURL, bytes.NewReader(body))
if err != nil {
g.logger.Error("failed to create backend request", "request_id", requestID, "error", err)
http.Error(w, "Internal server error", http.StatusInternalServerError)
return
}
// Hint: Passing r.Context() propagates the deadline and cancellation from the original client request

TODO 13: Forward the client's headers to the backend and stamp the request ID
// FILL IN: Copy the client's headers, skipping Connection (a hop-by-hop header). Go already moves Host out of r.Header into r.Host, so the Host check is only defensive
for key, values := range r.Header {
if key != "Host" && key != "Connection" {
for _, value := range values {
req.Header.Add(key, value)
}
}
}
req.Header.Set("X-Request-ID", requestID)
// Hint: Content-Type and Authorization must be forwarded so the backend can parse the body and authenticate

TODO 14: Send the request to the backend and record timing metrics
// FILL IN: Execute the request, measure duration, and label backend metrics
start := time.Now()
resp, err := g.httpClient.Do(req)
duration := time.Since(start).Seconds()
statusLabel := "error"
if resp != nil {
statusLabel = fmt.Sprintf("%d", resp.StatusCode)
}
backendRequestsTotal.WithLabelValues(path, statusLabel).Inc()
backendRequestDuration.WithLabelValues(path).Observe(duration)
if err != nil {
g.logger.Error("backend request failed", "request_id", requestID, "error", err)
http.Error(w, "Backend unavailable", http.StatusBadGateway)
return
}
defer resp.Body.Close()
// Hint: Record metrics even on error so dashboards show failure rate, not just silence

TODO 15: Copy all response headers from the backend to the client
// FILL IN: Forward every header the backend sets so clients see Content-Type, Cache-Control, etc.
for key, values := range resp.Header {
for _, value := range values {
w.Header().Add(key, value)
}
}
// Hint: Headers must be set before WriteHeader is called. Go's ResponseWriter ignores headers set afterward

TODO 16: Write the backend's status code and stream its body to the client
// FILL IN: Forward status code first, then stream the response body
w.WriteHeader(resp.StatusCode)
_, err = io.Copy(w, resp.Body)
if err != nil {
g.logger.Error("failed to write response", "request_id", requestID, "error", err)
}
// Hint: io.Copy streams the body without loading it fully into memory, which matters for large batch responses

TODO 17: Create a structured JSON logger with the configured log level
// FILL IN: Map the LOG_LEVEL string to an slog.Level and build a JSON handler
var logLevel slog.Level
switch config.LogLevel {
case "debug": logLevel = slog.LevelDebug
case "info": logLevel = slog.LevelInfo
case "warn": logLevel = slog.LevelWarn
case "error": logLevel = slog.LevelError
default: logLevel = slog.LevelInfo
}
logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: logLevel}))
slog.SetDefault(logger)
// Hint: JSON output lets log aggregators (Loki, Datadog) parse fields like request_id and duration_ms automatically

TODO 18: Configure the HTTP server with production-safe timeouts
// FILL IN: Create http.Server with read, write, and idle timeouts
server := &http.Server{
Addr: ":" + config.Port,
Handler: gateway.createHandler(),
ReadTimeout: 15 * time.Second,
WriteTimeout: 15 * time.Second,
IdleTimeout: 60 * time.Second,
}
// Hint: Without timeouts, slow or stalled clients can hold connections open indefinitely, each one tying up a goroutine, memory, and a file descriptor

TODO 19: Start the server in a goroutine so the main goroutine can handle shutdown signals
// FILL IN: Launch ListenAndServe in a goroutine; treat ErrServerClosed as a clean exit
go func() {
logger.Info("server listening", "addr", server.Addr)
if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
logger.Error("server failed", "error", err)
os.Exit(1)
}
}()
// Hint: http.ErrServerClosed is returned by ListenAndServe after Shutdown() is called, so it does not indicate a failure

TODO 20: Block until a termination signal arrives, then shut down gracefully
// FILL IN: Wait for SIGINT or SIGTERM, then give in-flight requests 30 s to finish
quit := make(chan os.Signal, 1)
signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
<-quit
logger.Info("shutting down server...")
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := server.Shutdown(ctx); err != nil {
logger.Error("server forced to shutdown", "error", err)
os.Exit(1)
}
logger.Info("server stopped gracefully")
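// Hint: Kubernetes sends SIGTERM, then SIGKILL once terminationGracePeriodSeconds (30 s by default) elapses. This 30-second window lets active predictions complete, provided the pod's grace period is at least as long

# Return to module-4 root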
cd ..
# Build Docker image
docker build -t api-gateway:v1 .
# Load into kind
kind load docker-image api-gateway:v1 --name mlops-workshop

# Ensure ML service is running
kubectl get svc sentiment-api-service
# Deploy gateway
kubectl apply -f deployment.yaml
# Check deployment
kubectl get all -l app=api-gateway
# Port-forward to test
kubectl port-forward svc/api-gateway-service 8080:80
# Test (macOS/Linux/WSL)
curl http://localhost:8080/health
curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"text":"Kubernetes + Go!"}'
# Test (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health
$body = '{"text":"Kubernetes + Go!"}'
Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body

# macOS/Linux/WSL
curl -fsS http://localhost:8080/health >/dev/null && echo "health OK"
curl -fsS http://localhost:8080/metrics | grep -q http_requests_total && echo "metrics OK"
curl -fsS -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"text":"Kubernetes + Go!"}' | grep -q sentiment && echo "predict OK"

# Windows PowerShell
try { Invoke-RestMethod -Uri http://localhost:8080/health | Out-Null; Write-Host "health OK" } catch { Write-Host "health FAILED" }
$metrics = Invoke-RestMethod -Uri http://localhost:8080/metrics
if ($metrics -match "http_requests_total") { Write-Host "metrics OK" } else { Write-Host "metrics FAILED" }
$body = '{"text":"Kubernetes + Go!"}'
$result = Invoke-RestMethod -Method Post -Uri http://localhost:8080/predict -ContentType "application/json" -Body $body
if ($result.sentiment) { Write-Host "predict OK" } else { Write-Host "predict FAILED" }

- HTTP Server: Creating and configuring http.Server
- Handlers: Request handling functions
- Routing: ServeMux for URL pattern matching
- Context: Request-scoped values
- Request Forwarding: Proxying requests to backend
- Header Propagation: Copying headers between requests
- Timeout Handling: Setting client and server timeouts
- Error Handling: Graceful error responses
- Structured Logging: JSON logs with log/slog
- Metrics: Prometheus instrumentation
- Request Tracking: Correlation IDs (X-Request-ID)
- Middleware: Composable request processing
- Graceful Shutdown: Clean termination on SIGTERM
- Logging: Structured JSON logs for parsing
- Metrics: HTTP request counters and duration histograms
- Tracing: Request ID propagation
- Health Checks: Backend availability monitoring
Without Gateway (Python only):
Python Service (10 replicas × 1Gi = 10Gi memory)
Cost: High
With Gateway (Go + Python):
Go Gateway (5 replicas × 64Mi = 320Mi)
↓
Python ML Service (3 replicas × 1Gi = 3Gi)
Total: 3.3Gi vs 10Gi = 67% reduction!
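The comparison above can be checked with a few lines of arithmetic. The sketch below only reproduces the example replica counts and sizes from this diagram; `totalMi` and `reductionPercent` are illustrative helper names.

```go
package main

import "fmt"

const miPerGi = 1024

// totalMi returns the combined memory request, in Mi, for replicas of equal size.
func totalMi(replicas, miEach int) int {
	return replicas * miEach
}

// reductionPercent returns how much smaller after is than before, as a percentage.
func reductionPercent(before, after int) float64 {
	return 100 * float64(before-after) / float64(before)
}

func main() {
	pythonOnly := totalMi(10, 1*miPerGi)               // 10 × 1Gi = 10240 Mi
	polyglot := totalMi(5, 64) + totalMi(3, 1*miPerGi) // 320 Mi + 3072 Mi = 3392 Mi
	fmt.Printf("%d Mi vs %d Mi: %.1f%% reduction\n",
		polyglot, pythonOnly, reductionPercent(pythonOnly, polyglot))
	// prints: 3392 Mi vs 10240 Mi: 66.9% reduction
}
```

The saving comes almost entirely from running 3 Python replicas instead of 10; the gateway's own 320 Mi is a small part of the total.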
| Layer | Language | Responsibility | Scales Based On |
|---|---|---|---|
| Gateway | Go | Routing, auth | Request count |
| ML Service | Python | Inference | CPU usage |
# Local development
go mod download # Download dependencies
go run gateway.go # Run locally
go build -o gateway gateway.go # Build binary
# Docker
docker build -t api-gateway:v1 .
kind load docker-image api-gateway:v1 --name mlops-workshop
# Kubernetes
kubectl apply -f deployment.yaml
kubectl get pods -l app=api-gateway
kubectl logs -l app=api-gateway -f
kubectl port-forward svc/api-gateway-service 8080:80

Symptoms:
go build gateway.go
# Error: package prometheus is not in GOROOT
Root Cause: Dependencies not downloaded
Solutions:
Step 1: Download dependencies
# Ensure you're in the starter directory
cd modules/module-4/starter
# Download all dependencies
go mod download
# Verify modules
go mod verify
# Tidy up (remove unused)
go mod tidy

Step 2: Check Go version
go version
# Should be >= 1.21
# If too old, upgrade Go (macOS)
brew upgrade go
# Windows: download latest installer from https://go.dev/dl/

Step 3: Clear cache if corrupted
# Clean mod cache
go clean -modcache
# Re-download
go mod download

Symptoms:
gateway.go:45:2: syntax error: unexpected }
Root Cause: Incomplete TODO or syntax mistake
Solutions:
Check compilation:
# Build to see all errors
go build gateway.go
# Also list packages as they are compiled (-v does not add detail to error messages)
go build -v gateway.go

Use go fmt:
# Auto-format code (helps catch syntax issues)
go fmt gateway.go
# Check for common mistakes
go vet gateway.go

Symptoms:
Error: dial tcp: lookup sentiment-api-service: no such host
Root Causes:
- Backend service not running
- Wrong service name or namespace
- DNS resolution issues
Solutions:
Check 1: Verify backend is running
# Check if service exists
kubectl get svc sentiment-api-service
# Check if pods are ready
kubectl get pods -l app=sentiment-api
# If not running, deploy it first (Module 3)
cd ../module-3
kubectl apply -f deployment.yaml

Check 2: Test backend connectivity
# Port-forward backend
kubectl port-forward svc/sentiment-api-service 3000:80
# Test in another terminal
curl http://localhost:3000/health

Check 3: Verify service name in gateway
# Check BACKEND_URL environment variable
kubectl get deployment api-gateway -o yaml | grep BACKEND_URL
# Should be: http://sentiment-api-service:80

Check 4: Check gateway logs
# View logs
kubectl logs -l app=api-gateway --tail=50
# Look for connection errors
kubectl logs -l app=api-gateway | grep -i "error\|fail"

Symptoms:
curl http://localhost:8080/predict
# HTTP 502 Bad Gateway
Root Cause: Backend service is down or returning errors
Solutions:
Step 1: Check backend health
# Direct backend test
kubectl port-forward svc/sentiment-api-service 3000:80
curl http://localhost:3000/health

Step 2: Check gateway logs
# View detailed logs
kubectl logs -l app=api-gateway -f
# Look for proxy errors
kubectl logs -l app=api-gateway | grep proxy

Step 3: Verify backend response
# Exec into gateway pod
kubectl exec -it <gateway-pod> -- sh
# Test from inside pod
wget -qO- http://sentiment-api-service:80/health

Still stuck? Check the solution file
# Navigate to module
cd modules/module-4/starter
# Download dependencies
go mod download
# Run locally (ensure backend is running)
# macOS/Linux/WSL
export BACKEND_URL=http://localhost:3000
go run gateway.go
# Windows PowerShell
$env:BACKEND_URL = "http://localhost:3000"
go run gateway.go
# In another terminal, test (macOS/Linux/WSL)
curl http://localhost:8080/health
# In another terminal, test (Windows PowerShell)
Invoke-RestMethod -Uri http://localhost:8080/health

# Build Docker image
docker build -t api-gateway:v1 .
# Build with specific tag
docker build -t api-gateway:v1.0.0 .
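# Run locally (macOS/Linux/WSL)
# On Linux without Docker Desktop, host.docker.internal is not defined by default;
# add --add-host=host.docker.internal:host-gateway to the command
docker run -p 8080:8080 \
-e BACKEND_URL=http://host.docker.internal:3000 \
api-gateway:v1

# Run locally (Windows PowerShell)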
docker run -p 8080:8080 `
-e BACKEND_URL=http://host.docker.internal:3000 `
api-gateway:v1

# Check image size
docker images api-gateway:v1
# Inspect layers
docker history api-gateway:v1
# Load into kind
kind load docker-image api-gateway:v1 --name mlops-workshop

# Apply deployment
kubectl apply -f deployment.yaml
# Get all resources
kubectl get all -l app=api-gateway
# Get pods with details
kubectl get pods -l app=api-gateway -o wide
# Describe deployment
kubectl describe deployment api-gateway
# View logs
kubectl logs -l app=api-gateway
# Follow logs
kubectl logs -l app=api-gateway -f
# View logs from specific container
kubectl logs <pod-name> -c api-gateway
# Exec into pod
kubectl exec -it <pod-name> -- sh
# Port-forward
kubectl port-forward svc/api-gateway-service 8080:80
# Scale manually
kubectl scale deployment api-gateway --replicas=3
# Restart deployment
kubectl rollout restart deployment api-gateway
# Check rollout status
kubectl rollout status deployment api-gateway
# View rollout history
kubectl rollout history deployment api-gateway

# macOS/Linux/WSL
# Check if gateway is running
ps aux | grep gateway
# Check port usage
lsof -i :8080
# Test locally
curl http://localhost:8080/health
curl http://localhost:8080/metrics
# Test with verbose output
curl -v http://localhost:8080/health
# Test request ID
curl -H "X-Request-ID: test-123" http://localhost:8080/health

# Windows PowerShell
# Check if gateway is running
Get-Process | Where-Object { $_.Name -like "*gateway*" }
# Check port usage
netstat -ano | findstr :8080
# Test locally
Invoke-RestMethod -Uri http://localhost:8080/health
Invoke-RestMethod -Uri http://localhost:8080/metrics
# Test with verbose output
Invoke-WebRequest -Uri http://localhost:8080/health -Verbose
# Test request ID
Invoke-RestMethod -Uri http://localhost:8080/health -Headers @{"X-Request-ID"="test-123"}

# Cross-platform (Go tools)
# Check Go environment
go env
# Check dependencies
go list -m all
# Why is package included?
go mod why github.com/prometheus/client_golang
# Dependency graph
go mod graph | grep prometheus

# macOS/Linux/WSL
# View Prometheus metrics
curl http://localhost:8080/metrics
# Filter specific metrics
curl http://localhost:8080/metrics | grep http_requests_total
# Parse metrics
curl -s http://localhost:8080/metrics | \
grep http_requests_total | \
grep method
# Watch metrics change
watch -n 1 'curl -s http://localhost:8080/metrics | grep http_requests_total'

# Windows PowerShell
# View Prometheus metrics
Invoke-RestMethod -Uri http://localhost:8080/metrics
# Filter specific metrics
Invoke-RestMethod -Uri http://localhost:8080/metrics | Select-String "http_requests_total"
# Watch metrics change (refresh every second)
while ($true) {
Invoke-RestMethod -Uri http://localhost:8080/metrics | Select-String "http_requests_total"
Start-Sleep -Seconds 1
}

# macOS/Linux/WSL: install hey and run load tests
go install github.com/rakyll/hey@latest
hey -n 1000 -c 10 http://localhost:8080/health
hey -n 1000 -c 10 -m POST \
-H "Content-Type: application/json" \
-d '{"text":"load test"}' \
http://localhost:8080/predict
hey -z 30s -c 20 http://localhost:8080/health
# Save results
hey -n 1000 -c 10 http://localhost:8080/health > results.txt
# Using Apache Bench
ab -n 1000 -c 10 http://localhost:8080/health

# Windows PowerShell: install hey, then fix PATH
go install github.com/rakyll/hey@latest
# Add Go bin to PATH for the current session if hey isn't found
$env:PATH += ";$env:USERPROFILE\go\bin"
# Verify that hey is now on PATH
Get-Command hey
# Basic load test
hey -n 1000 -c 10 http://localhost:8080/health
# POST load test
hey -n 1000 -c 10 -m POST -H "Content-Type: application/json" -d '{"text":"load test"}' http://localhost:8080/predict

# Windows PowerShell: alternative without hey (built-in load test)
$url = "http://localhost:8080/health"
$requests = 100
$start = Get-Date
1..$requests | ForEach-Object {
Invoke-RestMethod -Uri $url | Out-Null
}
$elapsed = (Get-Date) - $start
Write-Host "$requests requests in $($elapsed.TotalSeconds)s ($([math]::Round($requests / $elapsed.TotalSeconds, 1)) req/s)"

# macOS/Linux/WSL
# Stop running gateway
pkill gateway
# Remove built binaries
rm gateway gateway-linux

# Windows PowerShell
# Stop running gateway
Stop-Process -Name gateway -Force -ErrorAction SilentlyContinue
# Remove built binaries
Remove-Item -Force gateway.exe, gateway-linux -ErrorAction SilentlyContinue

# Cross-platform
# Delete Kubernetes resources
kubectl delete -f deployment.yaml
# Delete by label
kubectl delete all -l app=api-gateway
# Remove Docker image
docker rmi api-gateway:v1
# Clean Go cache
go clean -cache
go clean -modcache

If you get stuck, a complete reference implementation is available in solution/:
- gateway_solution.go: All TODOs completed with detailed comments
Note: Try to complete the exercise on your own first!
Once you've completed the exercise and tests pass:
→ Module 5: Kubeflow Pipelines
In Module 5, you'll build ML pipelines for training automation!
- ✅ Go HTTP Server: Building production HTTP services
- ✅ Reverse Proxy: Forwarding requests to backend services
- ✅ Structured Logging: JSON logs for observability
- ✅ Metrics: Prometheus instrumentation
- ✅ Middleware: Composable request processing
- ✅ Graceful Shutdown: Clean termination
- ✅ Polyglot Architecture: Right tool for each job
- Use Go for I/O-bound, routing, and gateway layers
- Use Python for ML inference and data science
- Always implement structured logging
- Instrument code with Prometheus metrics
- Propagate request IDs for tracing
- Handle graceful shutdown
- Set appropriate timeouts
- Test backend connectivity in health checks
This configuration is suitable for production ML APIs that need:
- High Performance: Low latency, high throughput
- Resource Efficiency: Minimal memory footprint
- Observability: Structured logs and metrics
- Reliability: Health checks and graceful shutdown
- Scalability: Independent scaling of gateway and ML service
Having issues? Check the Troubleshooting section or review the solution file!