SebastianScherer88/zenml-aws

Overview

This repository implements a collection of AWS integrations for the zenml platform.

Contains:

The remote components

Setup

Python dependencies

Create a virtual environment with python 3.12:

uv venv --python=3.12

and activate it. Then install all python dependencies:

uv sync

Infrastructure

Provision the pulumi stack in the AWS cloud, including a publicly accessible RDS MySQL database for the remote zenml store. This needs to be run by an AWS identity that has the required pulumi provisioning permissions. My local setup achieves this with a designated PulumiDevRole that holds all the required policies and permissions and can be assumed by a pulumi-bootstrap user. I've configured this setup using the configuration files ~/.aws/config and ~/.aws/credentials below:

# ~/.aws/config
[profile pulumi]
role_arn = arn:aws:iam::743582000746:role/PulumiDevRole
source_profile = pulumi-bootstrap
region = eu-west-1

# ~/.aws/credentials
[pulumi-bootstrap]
aws_access_key_id = <AWS_ACCESS_KEY_ID>
aws_secret_access_key = <AWS_SECRET_ACCESS_KEY>

With the profile in place, select it and provision the stack:

export AWS_PROFILE=pulumi # 'set AWS_PROFILE=pulumi' on windows
cd infrastructure
pulumi up -y
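
Before running pulumi, it can help to check that the role profile and its source profile are wired together as intended. The following is a minimal sketch using only the Python standard library; the function name `resolve_role_profile` and the inline sample contents are illustrative assumptions, not part of this repository.

```python
import configparser

# Sample file contents mirroring the setup described above (placeholders kept as-is).
CONFIG = """
[profile pulumi]
role_arn = arn:aws:iam::743582000746:role/PulumiDevRole
source_profile = pulumi-bootstrap
region = eu-west-1
"""

CREDENTIALS = """
[pulumi-bootstrap]
aws_access_key_id = <AWS_ACCESS_KEY_ID>
aws_secret_access_key = <AWS_SECRET_ACCESS_KEY>
"""


def resolve_role_profile(config_text: str, credentials_text: str, profile: str) -> dict:
    """Return the role ARN, source profile and region of an assume-role profile."""
    # Interpolation is disabled so that '%' characters in secrets are left alone.
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(config_text)
    credentials = configparser.ConfigParser(interpolation=None)
    credentials.read_string(credentials_text)

    # In ~/.aws/config, named profiles live under "[profile <name>]".
    section = f"profile {profile}"
    if section not in config:
        raise KeyError(f"profile {profile!r} not found in config")

    source = config[section]["source_profile"]
    # In ~/.aws/credentials, the section name is the bare profile name.
    if source not in credentials:
        raise KeyError(f"source profile {source!r} has no credentials entry")

    return {
        "role_arn": config[section]["role_arn"],
        "source_profile": source,
        "region": config[section].get("region"),
    }


info = resolve_role_profile(CONFIG, CREDENTIALS, "pulumi")
print(info["role_arn"])
```

To check the real files, read `~/.aws/config` and `~/.aws/credentials` from disk and pass their contents in place of the sample strings.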

Docker image

To authenticate your local docker client with the remote ECR registry you just provisioned, run:

aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin 743582000746.dkr.ecr.eu-west-1.amazonaws.com

To build a zenml docker image that can run remotely, run:

docker build -f infrastructure/docker/component/Dockerfile . -t 743582000746.dkr.ecr.eu-west-1.amazonaws.com/zenml:latest
docker push 743582000746.dkr.ecr.eu-west-1.amazonaws.com/zenml:latest
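
The registry host and image tag in the commands above follow a fixed pattern built from the AWS account ID and region. A small sketch of that pattern, useful when adapting the commands to another account, is shown below; the helper names are hypothetical and the Dockerfile path is written with forward slashes on the assumption that the docker CLI accepts them on all platforms.

```python
def ecr_registry(account_id: str, region: str) -> str:
    """Host name of the private ECR registry for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def image_reference(account_id: str, region: str, repository: str = "zenml", tag: str = "latest") -> str:
    """Fully qualified image reference, e.g. <registry>/zenml:latest."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


def docker_commands(
    account_id: str,
    region: str,
    dockerfile: str = "infrastructure/docker/component/Dockerfile",
) -> list[list[str]]:
    """Build and push commands as argument lists suitable for subprocess.run."""
    image = image_reference(account_id, region)
    return [
        ["docker", "build", "-f", dockerfile, ".", "-t", image],
        ["docker", "push", image],
    ]


commands = docker_commands("743582000746", "eu-west-1")
for command in commands:
    print(" ".join(command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` from the repository root once the docker client is logged in.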

Zenml step-operator and orchestrator test stack

Create a zenml stack using this library's integrations by running the following commands.

Log in to the remote SQL zenml store directly:

zenml login mysql://zenml:password@zenml-metdata-store992d729.c1cyu4q20nag.eu-west-1.rds.amazonaws.com:3306/zenml
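
If the database password contains characters such as `@` or `/`, the URL above becomes ambiguous unless those characters are percent-encoded. The sketch below assembles the store URL with the standard library; `mysql_store_url` is a hypothetical helper, and it assumes the URL is parsed with ordinary URL rules so that percent-encoded credentials are decoded on the receiving side.

```python
from urllib.parse import quote


def mysql_store_url(user: str, password: str, host: str, database: str = "zenml", port: int = 3306) -> str:
    """Assemble a mysql:// URL, percent-encoding the user name and password."""
    # safe="" forces reserved characters such as '@', ':' and '/' to be encoded.
    user_part = quote(user, safe="")
    password_part = quote(password, safe="")
    return f"mysql://{user_part}:{password_part}@{host}:{port}/{database}"


print(mysql_store_url("zenml", "p@ss/word", "db.example.com"))
```

The host `db.example.com` is a placeholder; substitute the RDS endpoint provisioned by the pulumi stack.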

Register the git repository as a local zenml repository:

zenml init

Register a remote type ECR container registry component:

zenml container-registry register aws-ecr -f aws --uri=743582000746.dkr.ecr.eu-west-1.amazonaws.com

Register a remote type S3 artifact store component:

zenml artifact-store register aws-s3 -f s3 --path=s3://zenml-artifact-store-29182ff

Batch Step Operator

zenml stack set default
zenml stack delete test-step-operator -y
zenml step-operator delete aws-batch
zenml step-operator flavor delete aws_batch
zenml step-operator flavor register zenml_aws.step_operator.aws_batch_step_operator_flavor.AWSBatchStepOperatorFlavor
zenml step-operator register aws-batch -f aws_batch --execution_role=arn:aws:iam::743582000746:role/batch-execution-role --job_role=arn:aws:iam::743582000746:role/batch-job-role --job_queue_name=zenml-test-fargate-job-queue --backend=FARGATE --tags="{\"test\": \"step-operator\"}" --assign_public_ip=ENABLED --timeout_seconds=900 --aws_profile=pulumi --delete_resources_on="[\"SUCCEEDED\"]" --log_group=/aws/batch/job
zenml stack register test-step-operator -o default -c aws-ecr -s aws-batch -a aws-s3
zenml stack set test-step-operator
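
The `--tags` and `--delete_resources_on` options above take JSON values, and the hand-escaped quotes are easy to get wrong across shells. One way to sidestep this is to serialize the values with `json.dumps` and pass the arguments as a list. The sketch below only builds the argument list; the helper name `step_operator_register_args` is an assumption for illustration, and only a subset of the options shown above is used.

```python
import json
import shlex


def step_operator_register_args(name: str, flavor: str, **options) -> list[str]:
    """Build the 'zenml step-operator register' argument list."""
    args = ["zenml", "step-operator", "register", name, "-f", flavor]
    for key, value in options.items():
        # Dicts and lists are passed as JSON strings, as in the command above.
        if isinstance(value, (dict, list)):
            value = json.dumps(value)
        args.append(f"--{key}={value}")
    return args


args = step_operator_register_args(
    "aws-batch",
    "aws_batch",
    job_queue_name="zenml-test-fargate-job-queue",
    backend="FARGATE",
    tags={"test": "step-operator"},
    delete_resources_on=["SUCCEEDED"],
    timeout_seconds=900,
)

# shlex.join produces a POSIX-shell-safe string for display or copy-paste.
print(shlex.join(args))
```

Passing `args` directly to `subprocess.run(args, check=True)` avoids shell quoting altogether, which also covers Windows, where `shlex.join` quoting does not apply.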

AWS Step operator component

For end-to-end tests running on the provisioned AWS infrastructure, run the test scripts in the scripts directory:

zenml stack set test-step-operator
python scripts/test_run_step_operator.py --backend EC2 --job-queue zenml-test-ec2-job-queue --memory 1000
python scripts/test_run_step_operator.py --backend FARGATE --job-queue zenml-test-fargate-job-queue --memory 2048
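
To run both backend configurations in one go, the two invocations above can be generated from a small table. This is a sketch that reuses only the flags and values shown above; the `RUNS` table and the `e2e_commands` and `run_all` helpers are hypothetical and not part of the repository.

```python
import subprocess
import sys

# (backend, job queue, memory) combinations taken from the commands above.
RUNS = [
    ("EC2", "zenml-test-ec2-job-queue", 1000),
    ("FARGATE", "zenml-test-fargate-job-queue", 2048),
]


def e2e_commands(script: str) -> list[list[str]]:
    """Build one command per backend configuration for the given test script."""
    return [
        [sys.executable, script, "--backend", backend, "--job-queue", queue, "--memory", str(memory)]
        for backend, queue, memory in RUNS
    ]


def run_all(script: str) -> None:
    """Run every configuration in order, stopping at the first failure."""
    for command in e2e_commands(script):
        subprocess.run(command, check=True)


commands = e2e_commands("scripts/test_run_step_operator.py")
for command in commands:
    print(" ".join(command[1:]))
```

Calling `run_all("scripts/test_run_step_operator.py")` with the test stack active would execute both runs sequentially.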

A pipeline using the AWS Batch step operator

Stepfunctions Orchestrator

zenml stack set default
zenml stack delete test-orchestrator -y
zenml step-operator delete aws-batch
zenml step-operator flavor delete aws_batch
zenml orchestrator delete aws-stepfunctions
zenml orchestrator flavor delete aws_stepfunctions
zenml step-operator flavor register zenml_aws.step_operator.aws_batch_step_operator_flavor.AWSBatchStepOperatorFlavor
zenml step-operator register aws-batch -f aws_batch --execution_role=arn:aws:iam::743582000746:role/batch-execution-role --job_role=arn:aws:iam::743582000746:role/batch-job-role --job_queue_name=zenml-test-fargate-job-queue --backend=FARGATE --tags="{\"test-1\": \"step-operator\"}" --assign_public_ip=ENABLED --timeout_seconds=900 --aws_profile=pulumi
zenml orchestrator flavor register zenml_aws.orchestrator.aws_stepfunctions_batch_orchestrator_flavor.AWSStepFunctionsOrchestratorFlavor
zenml orchestrator register aws-stepfunctions -f aws_stepfunctions --stepfunctions_execution_role=arn:aws:iam::743582000746:role/stepfunctions-execution-role --batch_execution_role=arn:aws:iam::743582000746:role/batch-execution-role --batch_job_role=arn:aws:iam::743582000746:role/batch-job-role --job_queue_name=zenml-test-fargate-job-queue --backend=FARGATE --tags="{\"test-2\": \"orchestrator\"}" --assign_public_ip=ENABLED --timeout_seconds=900 --aws_profile=pulumi --delete_stepfunctions_resource_on="[]" --batch_log_group=/aws/batch/job --stepfunctions_log_group_arn=arn:aws:logs:eu-west-1:743582000746:log-group:/aws/batch/job:*
zenml stack register test-orchestrator -o aws-stepfunctions -c aws-ecr -a aws-s3 -s aws-batch
zenml stack set test-orchestrator
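
The role and log group ARNs passed to the orchestrator follow the standard AWS ARN layout, so they can be derived from the account ID and region rather than typed by hand. The helpers below are a minimal sketch with hypothetical names; the trailing `:*` on the log group ARN matches the value used in the command above.

```python
def iam_role_arn(account_id: str, role_name: str) -> str:
    """ARN of an IAM role; IAM ARNs carry no region component."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def log_group_arn(region: str, account_id: str, log_group: str, with_streams: bool = True) -> str:
    """ARN of a CloudWatch log group, optionally with the ':*' suffix."""
    arn = f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}"
    return f"{arn}:*" if with_streams else arn


ACCOUNT_ID = "743582000746"
REGION = "eu-west-1"

print(iam_role_arn(ACCOUNT_ID, "stepfunctions-execution-role"))
print(log_group_arn(REGION, ACCOUNT_ID, "/aws/batch/job"))
```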

AWS Stepfunctions component

For end-to-end tests running on the provisioned AWS infrastructure, run the test scripts in the scripts directory:

zenml stack set test-orchestrator
python scripts/test_run_orchestrator.py --backend EC2 --job-queue zenml-test-ec2-job-queue --memory 1000
python scripts/test_run_orchestrator.py --backend FARGATE --job-queue zenml-test-fargate-job-queue --memory 2048

A pipeline using the AWS Stepfunctions orchestrator

Zenml dashboard (optional)

It can be useful to track the state of stacks and pipelines via the zenml dashboard running independently against the same SQL zenml store:

cd infrastructure
docker compose up

The username is default, and the password is empty.

Tests

For local-only unit and integration tests, run the pytest test suites in the respective directories:

pytest tests/unit -vv # unit tests
pytest tests/integration -vv # integration tests

For end-to-end tests, see the sections above.
