Module 5
By the end of this module, you will have:
- ✅ Running Kubeflow Pipelines on local Kubernetes
- ✅ Working ML pipeline (data prep → train → evaluate → deploy)
- ✅ Model deployed as REST API with KServe
- ✅ Understanding of production ML orchestration
- What Kubeflow Pipelines and KServe are and when to use them
- How to build reusable pipeline components
- How to create end-to-end ML workflows
- How to deploy models as scalable APIs
- How to integrate deployed models into applications
Before starting, ensure you have:
- Docker Desktop installed and running

  ```shell
  docker --version
  docker ps   # Should connect without error
  ```

- kubectl (Kubernetes CLI)

  ```shell
  kubectl version --client
  ```

- kind (Kubernetes in Docker)

  ```shell
  kind version
  ```

- Python 3.9-3.13

  ```shell
  python --version
  ```

- 8GB+ RAM available for the Kubernetes cluster
  - Check Docker Desktop > Settings > Resources
```shell
# macOS / Linux / WSL
cd modules/module-5 && ./scripts/install-kubeflow.sh
```

```powershell
# Windows PowerShell
cd modules/module-5; .\scripts\install-kubeflow.ps1
```

What this does:
- Creates kind cluster named `mlops-workshop` (if it does not exist)
- Installs cert-manager (required for Kubeflow)
- Installs Kubeflow Pipelines v2.14.3
- Patches MinIO with a compatible image
- Waits for all components to start
Expected output:

```
✓ kind cluster created/verified
✓ Kubeflow Pipelines installed
✓ Waiting for pods to be ready...
✓ Installation complete!

Next steps:
1. kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
2. Open http://localhost:8080
```
Installation takes 5-10 minutes on first run. Monitor progress:
```shell
# Watch pod status (Ctrl+C to exit when all Running)
kubectl get pods -n kubeflow -w
```

All pods should show Running with 1/1 or 2/2 READY:
```
NAME                            READY   STATUS    RESTARTS   AGE
cache-server-xxx                2/2     Running   0          3m
metadata-envoy-deployment-xxx   1/1     Running   0          3m
metadata-grpc-deployment-xxx    2/2     Running   0          3m
minio-xxx                       2/2     Running   0          3m
ml-pipeline-xxx                 2/2     Running   0          3m
ml-pipeline-ui-xxx              2/2     Running   0          3m
mysql-xxx                       2/2     Running   0          3m
workflow-controller-xxx         2/2     Running   0          3m
```
```shell
# In a separate terminal, keep this running
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
```

Open browser: http://localhost:8080
You should see the Kubeflow Pipelines dashboard.
```shell
# macOS / Linux / WSL
cd modules/module-5
python3 -m venv venv
source venv/bin/activate
```

```powershell
# Windows PowerShell
cd modules/module-5
python -m venv venv
venv\Scripts\Activate.ps1
```

```shell
# Navigate to starter directory
cd starter

# Install all required packages
pip install -r requirements.txt
```

What gets installed:
- `kfp==2.14.3` - Kubeflow Pipelines SDK
- `pandas`, `numpy`, `scikit-learn` - ML libraries
- `setuptools`, `wheel` - Build tools (for Python 3.13+)
```shell
# macOS / Linux / WSL
pip show kfp
python3 -c "from kfp.dsl import component, pipeline; print('KFP imports working')"
```

```powershell
# Windows PowerShell
pip show kfp
python -c "from kfp.dsl import component, pipeline; print('KFP imports working')"
```

Run the verification script:
```shell
# macOS / Linux / WSL
cd modules/module-5 && ./scripts/verify-installation.sh
```

```powershell
# Windows PowerShell
cd modules/module-5; .\scripts\verify-installation.ps1
```

Expected output:
```
✓ kubectl installed
✓ kind cluster running
✓ kubeflow namespace exists
✓ All pods running
✓ ml-pipeline-ui service available
✓ All checks passed!
```
If any checks fail, see Troubleshooting
Traditional ML workflow:
```shell
python prepare_data.py
python train_model.py --data=./data/train.csv
python evaluate.py --model=./models/model.pkl
kubectl apply -f deployment.yaml  # if model is good
```

Problems:
- ❌ Not reproducible (hard to rerun exactly)
- ❌ No tracking (which data? which code?)
- ❌ Manual (human runs each step)
- ❌ No dependency management (what if step fails?)
- ❌ Hard to share with team
Kubeflow Pipelines turns your ML workflow into an automated, reproducible graph:
```
┌─────────────┐
│  Data Prep  │ → train_data, test_data
└──────┬──────┘
       ↓
┌─────────────┐
│ Train Model │ → trained_model
└──────┬──────┘
       ↓
┌─────────────┐      ┌─────────────┐
│  Evaluate   │←─────┤  test_data  │
└──────┬──────┘      └─────────────┘
       ↓
┌─────────────┐
│   Deploy    │ → REST API
│  (KServe)   │
└─────────────┘
```
Benefits:
- ✅ Reproducible - Same code + data + parameters = same results
- ✅ Automated - One click to run entire workflow
- ✅ Tracked - All inputs, outputs, and metrics logged
- ✅ Scalable - Runs on Kubernetes, auto-scales
- ✅ Shareable - Export as YAML, anyone can run
Components are self-contained code blocks that run in isolated containers:
```python
from kfp.dsl import component, Output, Dataset

@component(
    base_image="python:3.11-slim",
    packages_to_install=["pandas==2.0.3"]
)
def prepare_data(output_data: Output[Dataset]):
    """Download and prepare data"""
    import pandas as pd
    data = pd.read_csv("https://example.com/data.csv")
    data.to_csv(output_data.path, index=False)
```

Key features:
- Runs in its own container (isolated)
- Declares dependencies (`pandas`)
- Typed outputs (`Dataset`)
- Reusable across pipelines
Pipelines connect components into a workflow (DAG):
```python
from kfp.dsl import pipeline

@pipeline(name="my-ml-pipeline")
def my_pipeline():
    # Step 1: Prepare data
    prep_task = prepare_data()

    # Step 2: Train (uses step 1 output)
    train_task = train_model(
        data=prep_task.outputs["output_data"]
    )

    # Ensure order
    train_task.after(prep_task)
```

Automatic features:
- Runs steps in correct order
- Passes data between steps
- Tracks all inputs/outputs
- Handles failures
Artifacts are typed data passed between components:
| Type | Purpose | Example |
|---|---|---|
| `Dataset` | Training/test data | CSV files |
| `Model` | Trained models | Pickle, SavedModel |
| `Metrics` | Performance metrics | Accuracy, RMSE |
How it works:
```python
import pickle
import pandas as pd
from kfp.dsl import Input, Output, Dataset, Model

def train_model(
    train_data: Input[Dataset],   # Read from previous step
    model: Output[Model]          # Write for next step
):
    df = pd.read_csv(train_data.path)
    trained = fit(df)             # fit() stands in for your training code
    with open(model.path, 'wb') as f:
        pickle.dump(trained, f)
```

- Kubeflow stores artifacts in MinIO (S3-compatible storage)
- Components read/write using `.path`
- Automatic lineage tracking
After training a model, you need to:
- Create HTTP server for predictions
- Handle scaling (0 to many replicas)
- Manage deployments (blue/green, canary)
- Monitor performance
Doing this manually is complex!
KServe is a Kubernetes-native platform that turns your model into a production REST API:
```
Your Model (model.pkl)
        ↓ Deploy
┌──────────────────────────────┐
│  KServe InferenceService     │
│                              │
│  HTTP: /v1/models/NAME:predict
│                              │
│  Auto-scaling: 0 → many pods │
│  Monitoring: metrics, logs   │
└──────────────────────────────┘
        ↓
  Your App calls API
```
What you get:
- ✅ Standard API - All models use same format
- ✅ Auto-scaling - Scale to zero (save $), scale up on traffic
- ✅ Canary deployments - Test new versions with % of traffic
- ✅ Monitoring - Request logs, latency, errors
All KServe models use this standard format:
Request:
```
POST /v1/models/MODEL_NAME:predict
```

```json
{
  "instances": [
    {"user_id": 1, "n_recommendations": 5}
  ]
}
```

Response:
```json
{
  "predictions": [
    {
      "user_id": 1,
      "recommendations": [
        {"movie_id": 50, "movie_name": "Star Wars", "score": 0.89}
      ]
    }
  ]
}
```

```
┌────────────────────────────────────────────┐
│ Kubeflow Pipeline                          │
│                                            │
│ Data Prep → Train → Evaluate → Deploy      │
│                          ↓                 │
│                      model.pkl             │
└──────────────────────┬─────────────────────┘
                       │ Creates
                       ↓
┌────────────────────────────────────────────┐
│ KServe InferenceService                    │
│                                            │
│ REST API: http://service:8080/predict      │
│                                            │
│ Your App → [Request] → [Model] → [Response]│
└────────────────────────────────────────────┘
```
Complete workflow:
- Kubeflow Pipeline trains and evaluates model
- Deploy component creates KServe InferenceService
- KServe starts serving model as HTTP API
- Your application calls API for predictions
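From the application side, step 4 of the workflow above boils down to one POST request. A minimal Python client sketch follows; the port assumes the 8082 port-forward used later in this module, and `build_payload`/`get_recommendations` are illustrative names, not part of the workshop code:

```python
import json
import urllib.request


def build_payload(user_id, n_recommendations=5, genre=None):
    """Build a KServe v1 'instances' request body."""
    instance = {"user_id": user_id, "n_recommendations": n_recommendations}
    if genre:
        instance["genre"] = genre
    return {"instances": [instance]}


def get_recommendations(user_id, n_recommendations=5,
                        base_url="http://localhost:8082"):
    """POST to the predict endpoint and return the first prediction."""
    req = urllib.request.Request(
        f"{base_url}/v1/models/recommender:predict",
        data=json.dumps(build_payload(user_id, n_recommendations)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["predictions"][0]
```

Because every KServe model speaks the same protocol, this client works unchanged against any other InferenceService by swapping the model name in the URL.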
Goal: Build a component to download and prepare the MovieLens dataset.
```shell
cd modules/module-5/starter

# Open in VS Code
code components/data_prep.py
```

You need to implement:
- TODO 1 — Set `base_image="python:3.11-slim"` and `packages_to_install=["pandas==2.0.3", "scikit-learn==1.3.2"]` in the `@component` decorator
- TODO 2 — Set `dataset_url` to `"https://files.grouplens.org/datasets/movielens/ml-100k.zip"`
- TODO 3 — Download the zip using `urllib.request.urlretrieve(dataset_url, zip_path)`
- TODO 4 — Load ratings with `pd.read_csv("/tmp/ml-100k/u.data", sep='\t', names=['userId','movieId','rating','timestamp'], engine='python')`
- TODO 5 — Split data: `train_df, test_df = train_test_split(ratings, test_size=test_ratio, random_state=random_state)`
- TODO 6 — Save training data: `train_df.to_csv(train_data.path, index=False)`
- TODO 7 — Save test data: `test_df.to_csv(test_data.path, index=False)`
Key concepts to use:
- `Output[Dataset]` for outputs
- `.path` attribute to get file path
- `pd.read_csv()` and `df.to_csv()`
Stuck? Look at `solution/components/data_prep_solution.py`
Goal: Create components for training a collaborative filtering model and evaluating it.
```shell
cd modules/module-5/starter
code components/train.py
```

What you'll implement:
- TODO 1 — Set `base_image="python:3.11-slim"` and `packages_to_install=["pandas==2.0.3", "numpy==1.24.3", "scikit-learn==1.3.2"]` in the `@component` decorator
- TODO 2 — Load training data: `ratings_df = pd.read_csv(train_data.path)`
- TODO 3 — Create SVD model: `TruncatedSVD(n_components=n_components, random_state=random_state)`
- TODO 4 — Fit and transform: `user_factors = svd_model.fit_transform(user_movie_matrix)`
- TODO 5 — Get movie factors: `movie_factors = svd_model.components_.T`
- TODO 6 — Log RMSE: `metrics.log_metric("rmse", float(rmse))`
- TODO 7 — Log explained variance: `metrics.log_metric("explained_variance", float(explained_variance))`
- TODO 8 — Log additional metrics: `n_components`, `n_users`, `n_movies`, `n_ratings`
- TODO 9 — Save model: `with open(model.path, 'wb') as f: pickle.dump(model_data, f)`
Key concepts:
- `Input[Dataset]` for inputs, `Output[Model]` for the model
- `Output[Metrics]` for logging to Kubeflow
- Matrix factorization with SVD
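To see what TODOs 3-5 are doing, here is the factorization core in isolation, run on a toy ratings frame. The pivot-table construction of `user_movie_matrix` is an assumption; the starter code may build the matrix differently:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD

# Toy ratings frame standing in for the MovieLens data
ratings_df = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3],
    "movieId": [10, 20, 10, 30, 20, 30],
    "rating":  [5, 3, 4, 2, 1, 5],
})

# Pivot to a user x movie matrix (missing ratings become 0)
user_movie_matrix = ratings_df.pivot_table(
    index="userId", columns="movieId", values="rating", fill_value=0
)

# TODO 3-5: factorize into user_factors (n_users x k) and movie_factors (n_movies x k)
svd_model = TruncatedSVD(n_components=2, random_state=42)
user_factors = svd_model.fit_transform(user_movie_matrix)
movie_factors = svd_model.components_.T

# Reconstruct ratings and compute RMSE over the known entries
# (this is the value TODO 6 would log via metrics.log_metric)
reconstructed = user_factors @ movie_factors.T
mask = user_movie_matrix.values > 0
rmse = np.sqrt(np.mean((user_movie_matrix.values[mask] - reconstructed[mask]) ** 2))
```

A user's recommendations then come from the rows of `reconstructed`: the highest-scoring movies that user has not yet rated.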
```shell
code components/evaluate.py
```

What you'll implement:
- TODO 1 — Set `base_image="python:3.11-slim"` and `packages_to_install=["pandas==2.0.3", "numpy==1.24.3", "scikit-learn==1.3.2"]`
- TODO 2 — Load model: `with open(model.path, 'rb') as f: model_data = pickle.load(f)`
- TODO 3 — Load test data: `test_df = pd.read_csv(test_data.path)`
- TODO 4 — Log test RMSE: `metrics.log_metric("test_rmse", float(rmse))`
- TODO 5 — Log test MAE: `metrics.log_metric("test_mae", float(mae))`
- TODO 6 — Log coverage metrics: `user_coverage`, `movie_coverage`, `n_test_ratings`, `n_evaluated_ratings`
Stuck? Look at `solution/components/train_solution.py` and `evaluate_solution.py`
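The two headline metrics from the evaluate TODOs reduce to a few lines of NumPy. The numbers below are made-up stand-ins for predicted vs. actual test ratings:

```python
import numpy as np

# Hypothetical predicted vs. actual ratings from the test split
actual = np.array([5.0, 3.0, 4.0, 2.0])
predicted = np.array([4.5, 3.2, 3.8, 2.5])

# These are the values TODOs 4-5 would log via metrics.log_metric
rmse = np.sqrt(np.mean((actual - predicted) ** 2))   # test_rmse
mae = np.mean(np.abs(actual - predicted))            # test_mae
```

RMSE penalizes large misses more heavily than MAE, which is why the workshop tracks both.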
Goal: Connect all components into an end-to-end pipeline.
```shell
cd modules/module-5/starter
code recommendation_pipeline.py
```

You'll create a pipeline that:
- TODO 1 — Set `name="movie-recommendation-pipeline"` and `description` in the `@pipeline` decorator
- TODO 2 — Call `prepare_data(dataset_size=dataset_size, test_ratio=test_ratio, random_state=random_state)` and store as `data_prep_task`
- TODO 3 — Set display name: `data_prep_task.set_display_name("Prepare MovieLens Data")`
- TODO 4 — Call `train_model(train_data=data_prep_task.outputs["train_data"], n_components=n_components, random_state=random_state)` and store as `train_task`
- TODO 5 — Set display name and dependency: `train_task.set_display_name("Train Recommendation Model")` and `train_task.after(data_prep_task)`
- TODO 6 — Call `evaluate_model(test_data=data_prep_task.outputs["test_data"], model=train_task.outputs["model"])` and store as `eval_task`
- TODO 7 — Set display name and dependency: `eval_task.set_display_name("Evaluate Model")` and `eval_task.after(train_task)`
- TODO 8 — Compile: `compiler.Compiler().compile(pipeline_func=recommendation_pipeline, package_path=output_path)`
Pipeline structure:
```python
@pipeline(name="movie-recommendation-pipeline")
def recommendation_pipeline():
    # Step 1
    data_task = prepare_data()

    # Step 2 (uses data from step 1)
    train_task = train_model(
        train_data=data_task.outputs["train_data"]
    )

    # Step 3 (uses test data and model)
    eval_task = evaluate_model(
        test_data=data_task.outputs["test_data"],
        model=train_task.outputs["model"]
    )
```

Before deploying with KServe, we need to create a Docker image that can serve predictions.
```shell
cd modules/module-5/model
ls
```

Files:
- `Dockerfile` - Container definition
- `serve.py` - Flask server implementing the KServe v1 predict protocol
- `recommender.py` - Movie recommendation logic
- `requirements.txt` - Python dependencies
How it works:
```python
# serve.py
@app.route("/v1/models/recommender:predict", methods=["POST"])
def predict():
    request_data = request.get_json()
    user_id = request_data["instances"][0]["user_id"]
    # Load model from MinIO and predict
    recommendations = model.recommend_movies(user_id)
    return jsonify({"predictions": [recommendations]})
```

```shell
cd modules/module-5/model

# Build image
docker build -t movie-recommender:latest .
```

What this does:
- Uses Python 3.11 slim base image
- Installs pandas, scikit-learn, numpy, boto3
- Copies serve.py and recommender.py
- Exposes port 8080
- Starts Flask server
IMPORTANT: Knative Serving (used by KServe) tries to resolve image digests from Docker Hub. For local images, we need to use a special prefix:
```shell
# Tag with kind.local prefix (bypasses digest resolution)
docker tag movie-recommender:latest kind.local/movie-recommender:latest
```

Why `kind.local`?
- The Knative config has `registries-skipping-tag-resolving: "kind.local,ko.local,dev.local"`
- Images with these prefixes skip digest resolution
- Allows local images to work without pushing to a registry
```shell
# Load image into kind cluster
kind load docker-image kind.local/movie-recommender:latest --name mlops-workshop

# Verify image is loaded
docker exec mlops-workshop-control-plane crictl images | grep movie-recommender
```

Expected output:

```
kind.local/movie-recommender   latest   abc123def456   50MB
```
KServe needs permissions to create InferenceServices:
```shell
cd modules/module-5

# Install KServe
./scripts/install-kserve.sh

# Apply RBAC fix
kubectl apply -f kserve/rbac-fix.yaml
```

Verify permissions:

```shell
kubectl auth can-i create inferenceservices --as=system:serviceaccount:kubeflow:pipeline-runner -n default
```

Should return: `yes`
```shell
cd modules/module-5/solution

# Compile solution pipeline (includes deploy component)
python recommendation_pipeline_solution.py --output pipeline_with_deploy.yaml
```

1. Upload `pipeline_with_deploy.yaml` to the Kubeflow UI
2. Create a run with parameters:
   - `deploy_model_flag`: True (enable deployment)
   - Other parameters: use defaults
3. Start the run
Pipeline will:
- Prepare data (5 min)
- Train model (3 min)
- Evaluate model (2 min)
- Deploy to KServe (2 min)
```shell
# Check InferenceService
kubectl get inferenceservice -n default

# Expected output:
# NAME                URL                                   READY
# movie-recommender   http://movie-recommender.default...   True
```

Check predictor pods:

```shell
kubectl get pods -n default | grep movie-recommender
# Expected: 1-2 predictor pods running
```

View logs:

```shell
# Get pod name
kubectl get pods -n default -l serving.kserve.io/inferenceservice=movie-recommender

# View logs
kubectl logs -n default <pod-name>
```

Get the full pod name first:
```shell
kubectl get pods -n default | grep movie-recommender
```

Then port-forward (use port 8082 to avoid a conflict with the Kubeflow UI on 8080):

```shell
# Run in a separate terminal and keep it open
kubectl port-forward -n default pod/<pod-name-from-above> 8082:8080
```

```shell
# macOS / Linux / WSL
curl -X POST http://localhost:8082/v1/models/recommender:predict -H "Content-Type: application/json" -d '{"instances": [{"user_id": 1, "n_recommendations": 5}]}'
```

```powershell
# Windows PowerShell
Invoke-RestMethod -Method Post -Uri "http://localhost:8082/v1/models/recommender:predict" -ContentType "application/json" -Body '{"instances": [{"user_id": 1, "n_recommendations": 5}]}' | ConvertTo-Json -Depth 10
```

Expected response:
```json
{
  "predictions": [
    {
      "user_id": 1,
      "recommendations": [
        {
          "movie_id": 50,
          "movie_name": "Star Wars (1977)",
          "score": 0.89,
          "genres": ["Action", "Adventure", "Sci-Fi"]
        },
        {
          "movie_id": 181,
          "movie_name": "Return of the Jedi (1983)",
          "score": 0.87,
          "genres": ["Action", "Adventure", "Sci-Fi"]
        }
      ]
    }
  ]
}
```

If you only see `movie_id` (no names/genres), see Re-running Pipeline
```shell
curl -X POST http://localhost:8082/v1/models/recommender:predict -H "Content-Type: application/json" -d '{"instances": [{"user_id": 42, "n_recommendations": 3}]}'
```

```powershell
# Windows PowerShell
Invoke-RestMethod -Method Post -Uri "http://localhost:8082/v1/models/recommender:predict" -ContentType "application/json" -Body '{"instances": [{"user_id": 42, "n_recommendations": 3}]}' | ConvertTo-Json -Depth 10
```

```javascript
class MovieRecommenderClient {
  // Default matches the 8082 port-forward used above
  constructor(baseUrl = 'http://localhost:8082') {
    this.baseUrl = baseUrl;
    this.predictUrl = `${baseUrl}/v1/models/recommender:predict`;
  }

  async getRecommendations(userId, nRecommendations = 10, genre = null) {
    const payload = {
      instances: [
        {
          user_id: userId,
          n_recommendations: nRecommendations
        }
      ]
    };

    // Add genre filter if specified
    if (genre) {
      payload.instances[0].genre = genre;
    }

    const response = await fetch(this.predictUrl, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(payload)
    });

    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }

    const data = await response.json();
    return data.predictions[0];
  }
}

// Usage
const client = new MovieRecommenderClient();
client.getRecommendations(1, 5)
  .then(result => {
    result.recommendations.forEach(rec => {
      console.log(`${rec.movie_name} (${rec.score.toFixed(2)})`);
      console.log(`  Genres: ${rec.genres.join(', ')}`);
    });
  })
  .catch(error => console.error('Error:', error));
```

The service supports filtering recommendations by genre:
- Action
- Adventure
- Animation
- Children
- Comedy
- Crime
- Documentary
- Drama
- Fantasy
- Film-Noir
- Horror
- Musical
- Mystery
- Romance
- Sci-Fi
- Thriller
- War
- Western
Get only Action movies:
```shell
curl -X POST http://localhost:8082/v1/models/recommender:predict -H "Content-Type: application/json" -d '{"instances": [{"user_id": 1, "n_recommendations": 5, "genre": "Action"}]}'
```

```powershell
# Windows PowerShell
Invoke-RestMethod -Method Post -Uri "http://localhost:8082/v1/models/recommender:predict" -ContentType "application/json" -Body '{"instances": [{"user_id": 1, "n_recommendations": 5, "genre": "Action"}]}' | ConvertTo-Json -Depth 10
```

Get Comedy movies:

```shell
curl -X POST http://localhost:8082/v1/models/recommender:predict -H "Content-Type: application/json" -d '{"instances": [{"user_id": 1, "n_recommendations": 5, "genre": "Comedy"}]}'
```

```powershell
# Windows PowerShell
Invoke-RestMethod -Method Post -Uri "http://localhost:8082/v1/models/recommender:predict" -ContentType "application/json" -Body '{"instances": [{"user_id": 1, "n_recommendations": 5, "genre": "Comedy"}]}' | ConvertTo-Json -Depth 10
```

How it works:
- Case-insensitive partial matching
- `"sci"` matches `"Sci-Fi"`
- `"action"` matches `"Action"`
- Only returns movies with a matching genre
During installation you will see these warnings in the MinIO pod logs:
```
WARNING: MINIO_ACCESS_KEY and MINIO_SECRET_KEY are deprecated.
Please use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD
```
This is normal. The Kubeflow manifests still use the old environment variable names, but MinIO accepts them and starts successfully. You can ignore these warnings — MinIO will be running and healthy.
Error:
minio-xxx 0/2 CrashLoopBackOff
Cause: Incompatible minio image (especially on ARM/Apple Silicon)
Fix:
```shell
kubectl set image deployment/minio -n kubeflow minio=minio/minio:RELEASE.2025-09-07T16-13-09Z-cpuv1
kubectl rollout status deployment/minio -n kubeflow
kubectl get pods -n kubeflow | grep minio
```

```powershell
# Windows PowerShell
kubectl set image deployment/minio -n kubeflow minio=minio/minio:RELEASE.2025-09-07T16-13-09Z-cpuv1
kubectl rollout status deployment/minio -n kubeflow
kubectl get pods -n kubeflow | Select-String "minio"
```

Error:
Cannot import 'setuptools.build_meta'
Cause: Python 3.13 doesn't include setuptools by default
Fix:
```shell
# Install setuptools first
pip install --upgrade pip setuptools wheel

# Then install requirements
pip install -r requirements.txt

# Verify
pip show setuptools  # Should show 65.0.0+
```

Error:
ApiException: (403) Forbidden
User 'system:serviceaccount:kubeflow:pipeline-runner' cannot create inferenceservices
Cause: Missing RBAC permissions
Fix:
```shell
# Apply RBAC
kubectl apply -f kserve/rbac-fix.yaml

# Verify
kubectl auth can-i create inferenceservices \
  --as=system:serviceaccount:kubeflow:pipeline-runner \
  -n default
# Should return: yes
```

Error:
Unable to fetch image "movie-recommender:latest"
failed to resolve image to digest: 401 Unauthorized
Cause: Knative tries to pull from Docker Hub, but image is local
Fix:
```shell
# 1. Tag with kind.local prefix
docker tag movie-recommender:latest kind.local/movie-recommender:latest

# 2. Load to kind
kind load docker-image kind.local/movie-recommender:latest --name mlops-workshop

# 3. Verify loaded
docker exec mlops-workshop-control-plane crictl images | grep movie-recommender

# 4. Update InferenceService
kubectl set image inferenceservice/movie-recommender -n default kserve-container=kind.local/movie-recommender:latest

# OR delete and recreate
kubectl delete inferenceservice movie-recommender -n default
# Then re-run pipeline with deploy_model_flag=True
```

Useful commands:

```shell
# Access UI
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80

# View pods
kubectl get pods -n kubeflow

# View pipeline runs
kubectl get pipelineruns -n kubeflow

# View component logs
kubectl logs <pod-name> -n kubeflow

# Restart Kubeflow
kubectl rollout restart deployment -n kubeflow
```

Cleanup:

```shell
# Delete pipeline run
kubectl delete pipelinerun <run-name> -n kubeflow

# Delete InferenceService
kubectl delete inferenceservice movie-recommender -n default

# Uninstall Kubeflow
kubectl delete namespace kubeflow
kubectl delete namespace cert-manager

# Delete kind cluster
kind delete cluster --name mlops-workshop
```

✅ Automated ML Workflows - End-to-end pipelines in code
✅ Component Reusability - Build once, use everywhere
✅ Artifact Tracking - Automatic versioning and lineage
✅ Production Deployment - Models as REST APIs with KServe
✅ Enterprise Orchestration - Kubernetes-native ML platform
| Feature | Manual | With Kubeflow |
|---|---|---|
| Workflow | Manual steps | Automated pipeline |
| Tracking | None | All artifacts logged |
| Reproducibility | Hard | One-click rerun |
| Deployment | Manual kubectl | Automated with KServe |
| Collaboration | Hard to share | Share YAML file |
| Visibility | Logs only | Visual graph + metrics |
Extend your pipeline:
- Add hyperparameter tuning component (GridSearchCV)
- Implement A/B testing with canary deployments (10% traffic)
- Add data validation (Great Expectations)
- Create model monitoring dashboard (Prometheus + Grafana)
Production considerations:
- Set up persistent artifact storage (S3, GCS)
- Configure resource limits and auto-scaling
- Implement CI/CD for pipeline updates (GitHub Actions)
- Add authentication and authorization
Continue learning:
- Kubeflow Pipelines Documentation
- KFP SDK v2 Guide
- KServe Documentation
- Collaborative Filtering Tutorial
| Previous | Home | Next |
|---|---|---|
| ← Module 4: API Gateway & Polyglot Architecture | 🏠 Home | Module 6: Monitoring & Observability → |
MLOps Workshop | GitHub Repository
Congratulations! 🎉
You've mastered enterprise ML workflow orchestration with Kubeflow Pipelines and model serving with KServe!