MLOps Project: Abalone Age Prediction

🎯 Project Overview

Welcome to your MLOps project! In this hands-on project, you'll build a complete machine learning system that predicts the age of abalone (a type of sea snail) from physical measurements, replacing the traditional, time-consuming method of counting shell rings under a microscope.

Your Mission: Transform a simple ML model into a production-ready system with automated training, deployment, and prediction capabilities.

📊 About the Dataset

Traditionally, determining an abalone's age requires:

Cutting the shell through the cone
Staining it
Counting rings under a microscope (very time-consuming!)

Your Goal: Use easier-to-obtain physical measurements (shell weight, diameter, etc.) to predict the age automatically.
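As a minimal sketch of that goal, the snippet below parses a few abalone records and separates the physical measurements from the target. The column names and the three sample rows are assumptions for illustration, so check them against the CSV you actually download. The `rings + 1.5` conversion follows the convention documented for the UCI Abalone dataset.

```python
import csv
import io

# Illustrative sample only: header names and rows are assumed, not taken from the project.
SAMPLE_CSV = """Sex,Length,Diameter,Height,Whole weight,Shucked weight,Viscera weight,Shell weight,Rings
M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15
M,0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7
F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9
"""

TARGET_COLUMN = "Rings"
CATEGORICAL_COLUMNS = {"Sex"}


def rings_to_age(rings: int) -> float:
    """Convert a ring count to an age in years (UCI convention: rings + 1.5)."""
    return rings + 1.5


def split_features_and_target(csv_text: str) -> tuple[list[dict], list[float]]:
    """Parse CSV text into feature dictionaries and ages in years."""
    features, ages = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove the target from the row so it cannot leak into the features.
        rings = int(row.pop(TARGET_COLUMN))
        parsed = {
            name: value if name in CATEGORICAL_COLUMNS else float(value)
            for name, value in row.items()
        }
        features.append(parsed)
        ages.append(rings_to_age(rings))
    return features, ages


X, y = split_features_and_target(SAMPLE_CSV)
print(X[0]["Shell weight"], y)
```

Whether you predict rings or age directly is a modelling choice; the conversion is a constant offset, so either target works as long as training and inference agree.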

📥 Download: Get the dataset from the Kaggle page

🚀 Quick Start

Prerequisites

GitHub account
Kaggle account (for dataset download)
Python 3.10 or 3.11

Setup Steps

Fork this repository
- ⚠️ Important: Uncheck "Copy the main branch only" to get all project branches
Add your team members as admins to your forked repository

Set up your development environment:

# Create the virtual environment, install dependencies, and activate it
uv sync
source .venv/bin/activate # on Windows: .venv\Scripts\activate

# Install pre-commit hooks for code quality
uv pip install pre-commit
uv run pre-commit install

📋 What You'll Build

By the end of this project, you'll have created:

🤖 Automated ML Pipeline

Training workflows using Prefect
Automatic model retraining on schedule
Reproducible data processing and model training

🌐 Prediction API

REST API for real-time predictions
Input validation with Pydantic
Docker containerization
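To make the input-validation item concrete, here is a minimal sketch of a Pydantic model for one prediction request. The class name, field names, and constraints are hypothetical and not taken from the project; adapt them to the features your model actually uses. It relies only on `BaseModel`, `Field`, and `ValidationError`, which exist in both Pydantic v1 and v2.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class AbaloneFeatures(BaseModel):
    """Hypothetical request body for one prediction (field names are assumptions)."""

    sex: Literal["M", "F", "I"]
    length: float = Field(gt=0)
    diameter: float = Field(gt=0)
    height: float = Field(gt=0)
    shell_weight: float = Field(gt=0)


def validate_payload(payload: dict) -> tuple[bool, int]:
    """Return (is_valid, number_of_validation_errors) for a raw payload."""
    try:
        AbaloneFeatures(**payload)
    except ValidationError as exc:
        return False, len(exc.errors())
    return True, 0


good = {"sex": "F", "length": 0.53, "diameter": 0.42, "height": 0.135, "shell_weight": 0.21}
bad = {"sex": "X", "length": -1.0, "diameter": 0.42, "height": 0.135, "shell_weight": 0.21}

print(validate_payload(good))
print(validate_payload(bad))
```

In a REST framework that integrates with Pydantic, a model like this would typically be declared as the request body type so that invalid payloads are rejected before they reach the prediction code.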

📊 Production-Ready Code

Clean, well-documented code
Automated testing and formatting
Proper error handling

📝 How to Work on This Project

The Branch-by-Branch Approach

This project is organized into numbered branches, each representing a step in building your MLOps system. Think of it like a guided tutorial where each branch teaches you something new!

Here's how it works:

Each branch = One pull request with specific tasks
Follow the numbers (branch_0, branch_1, etc.) in order
Read the PR instructions (PR_0.md, PR_1.md, etc.) before starting
Complete all TODOs in that branch's code
Create a pull request when done
Merge and move to the next branch

Step-by-Step Workflow

For each numbered branch:

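# Switch to the branch (replace branch_number_i with the branch name, e.g. branch_0)
git checkout branch_number_i

# Get latest changes from main (skip this for branch_0)
git pull origin main
# Note: Vim might open to edit the merge commit message - type ":wq" and press Enter to save and close it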

# Push your branch
git push

Then:

📖 Read the PR_i.md file carefully
💻 Complete all the TODOs in the code
🔧 Test your changes
📤 Open ONE pull request to your main branch
✅ Merge the pull request
🔄 Move to the next branch

💡 Pro Tip: Always pull main into a new branch before you start working on it (except branch_0), so it includes your previous work!

🔍 Understanding Pull Requests

Pull Requests (PRs) are how you propose and review changes before merging them into your main codebase. They're essential for team collaboration!

Important: When creating a PR, make sure the base repository is YOUR fork, not the original repository:

❌ Wrong: base repository set to the original repo

✅ Correct: base repository set to your fork

💡 Development Tips

Managing Dependencies

Use uv to manage dependencies. Install or update packages with:

uv add <package>==<version>

uv add records the change in pyproject.toml and uv.lock. To sync your environment with the lockfile, for example after pulling a teammate's changes, run:

uv sync

Code Quality

The pre-commit hooks automatically format your code on every commit
Remove all TODOs and unused code before final submission
Use clear variable names and add docstrings
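As an illustration of the last tip, here is one way a small helper might look with descriptive names, type hints, and a docstring. The function is a hypothetical example and not part of the project code.

```python
def mean_absolute_error(actual_ages: list[float], predicted_ages: list[float]) -> float:
    """Return the mean absolute error between actual and predicted ages.

    Args:
        actual_ages: Ground-truth ages in years.
        predicted_ages: Model predictions, in the same order as ``actual_ages``.

    Raises:
        ValueError: If the inputs are empty or differ in length.
    """
    if not actual_ages or len(actual_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    absolute_errors = [
        abs(actual - predicted)
        for actual, predicted in zip(actual_ages, predicted_ages)
    ]
    return sum(absolute_errors) / len(absolute_errors)


print(mean_absolute_error([10.5, 8.5], [9.5, 9.0]))
```

The docstring states units, argument order, and failure modes, which is the information a teammate reviewing your pull request is most likely to need.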

📊 Evaluation Criteria

Your project will be evaluated on:

🔍 Code Quality

Clean, readable code structure
Proper naming conventions
Good use of docstrings and type hints

🎨 Code Formatting

Consistent style (automated with pre-commit)
Professional presentation

⚙️ Functionality

Code runs without errors
All requirements implemented correctly

📖 Documentation & Reproducibility

Clear README with setup instructions
Team member names and GitHub usernames
Step-by-step instructions to run everything

🤝 Collaboration

Effective use of Pull Requests
Good teamwork and communication

🎯 Final Deliverables Checklist

When you're done, your repository should contain:

✅ Automated Training Pipeline

Prefect workflows for model training
Separate modules for training and inference
Reproducible model and encoder generation
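The sketch below illustrates the idea behind these three items using only the standard library: a training function that fits an encoder and a model with a fixed seed and saves both, and a separate inference function that only loads the saved artifacts. Everything here is an assumption for illustration: the per-category mean-age "model", the file names, the sample rows, and the use of `pickle` stand in for whatever estimator and serialization format your project uses.

```python
import pickle
import random
import tempfile
from pathlib import Path

SEED = 42  # fixed seed so the train/holdout split is identical on every run


def fit_sex_encoder(sexes: list[str]) -> dict[str, int]:
    """Map each category to an integer in sorted order (deterministic)."""
    return {category: index for index, category in enumerate(sorted(set(sexes)))}


def train(rows: list[tuple[str, float]], artifact_dir: Path) -> None:
    """Training side: fit the encoder and a per-category mean-age baseline, then persist both."""
    shuffled = rows[:]
    random.Random(SEED).shuffle(shuffled)
    train_rows = shuffled[: max(1, int(0.8 * len(shuffled)))]
    encoder = fit_sex_encoder([sex for sex, _ in train_rows])
    model = {}
    for sex, code in encoder.items():
        ages = [age for row_sex, age in train_rows if row_sex == sex]
        model[code] = sum(ages) / len(ages)
    (artifact_dir / "encoder.pkl").write_bytes(pickle.dumps(encoder))
    (artifact_dir / "model.pkl").write_bytes(pickle.dumps(model))


def predict(sex: str, artifact_dir: Path) -> float:
    """Inference side: load the persisted artifacts and predict an age in years."""
    encoder = pickle.loads((artifact_dir / "encoder.pkl").read_bytes())
    model = pickle.loads((artifact_dir / "model.pkl").read_bytes())
    if sex not in encoder:
        raise ValueError(f"unknown category: {sex!r}")
    return model[encoder[sex]]


# Hypothetical (sex, age) rows, for illustration only.
rows = [("M", 16.5), ("M", 8.5), ("F", 10.5), ("F", 11.5), ("I", 8.5), ("I", 9.5)]

with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    train(rows, Path(first))
    train(rows, Path(second))
    # Two independent training runs should produce byte-identical artifacts.
    same = (Path(first) / "model.pkl").read_bytes() == (Path(second) / "model.pkl").read_bytes()
    known = sorted(pickle.loads((Path(first) / "encoder.pkl").read_bytes()))
    predictions = {sex: predict(sex, Path(first)) for sex in known}

print(same, predictions)
```

Saving the encoder next to the model matters because inference must apply exactly the same category mapping that was used during training.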

✅ Automated Deployment

Prefect deployment for regular retraining

✅ Production API

Working REST API for predictions
Pydantic input validation
Docker containerization

✅ Professional Documentation

Updated README with team info
Clear setup and run instructions
All TODOs removed from code

Ready to start? Head to branch_0 and read PR_0.md for your first task! 🚀
