Move self-monitoring layers and private images to serverless-testing (093468662994) #1183
duncanista wants to merge 3 commits into main
Update build_private_image.sh to push to 093468662994 (serverless-testing) instead of 425362996713 (sandbox). The self-monitoring container runtimes (LOD, LMI) run in 093468662994, so co-locating the extension images there removes the need for cross-account ECR pulls during CDK Docker builds.
Pull request overview
Updates the CI “publish private images” workflow to publish Lambda extension container images into the serverless-testing AWS account (093468662994) to avoid cross-account ECR pulls during downstream builds.
Changes:
- Switch the “publish private images” GitLab job to assume the `serverless_testing` environment role.
- Update `build_private_image.sh` to target `093468662994.dkr.ecr.us-east-1.amazonaws.com/datadog-lambda-extension` by default (configurable via env vars) and to look up the next tag by querying the sandbox layer ARN.
- Add a new `serverless_testing` entry to the environments datasource.
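A minimal sketch of what "by default (configurable via env vars)" could look like in shell. The variable names `PRIVATE_IMAGE_ECR_ACCOUNT` and `PRIVATE_IMAGE_ECR_REPO` appear in the PR description; the `ecr_image_uri` helper and the `ECR_REGION` variable are hypothetical and are not taken from `build_private_image.sh`.

```shell
#!/bin/sh
# Sketch only: ecr_image_uri and ECR_REGION are hypothetical names,
# not taken from build_private_image.sh.

# Fall back to the serverless-testing account and repo unless overridden.
PRIVATE_IMAGE_ECR_ACCOUNT="${PRIVATE_IMAGE_ECR_ACCOUNT:-093468662994}"
PRIVATE_IMAGE_ECR_REPO="${PRIVATE_IMAGE_ECR_REPO:-datadog-lambda-extension}"
ECR_REGION="${ECR_REGION:-us-east-1}"

# Print the fully qualified image reference for a given tag.
ecr_image_uri() {
    tag="$1"
    printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' \
        "$PRIVATE_IMAGE_ECR_ACCOUNT" "$ECR_REGION" "$PRIVATE_IMAGE_ECR_REPO" "$tag"
}

ecr_image_uri 42
```

Exporting either variable before the script runs would redirect the push without editing the script, which is presumably the point of keeping the account and repo parameterizable.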
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| .gitlab/templates/pipeline.yaml.tpl | Makes the private image publish job assume the serverless-testing environment instead of sandbox. |
| .gitlab/scripts/build_private_image.sh | Changes ECR destination defaults and updates layer-version lookup logic for image tagging. |
| .gitlab/datasources/environments.yaml | Adds the serverless_testing environment definition (account/role/external-id). |

    latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
    VERSION=$(($latest_version + 1))

This script now calls aws lambda list-layer-versions against a sandbox layer ARN (425362996713), but the CI job that runs it was updated to assume the serverless_testing role. Unless that role is granted lambda:ListLayerVersions on the sandbox layer resource, this call will fail with AccessDenied and the image publish job will exit (due to set -e).
Either ensure the lambda-extension-image-publisher role has cross-account permission to list versions for the sandbox layer ARN(s), or adjust the workflow so the layer version lookup is performed with sandbox credentials (e.g., assume the sandbox role just for this lookup).

Suggested change:

    - latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
    - VERSION=$(($latest_version + 1))
    + SANDBOX_LAYER_ARN="arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}"
    + if [ -n "${SANDBOX_LAYER_LOOKUP_ROLE_ARN:-}" ]; then
    +     printf "Assuming sandbox role for layer version lookup in account %s...\n" "$SANDBOX_ACCOUNT"
    +     assume_role_output=$(aws sts assume-role \
    +         --region us-east-1 \
    +         --role-arn "$SANDBOX_LAYER_LOOKUP_ROLE_ARN" \
    +         --role-session-name "build-private-image-layer-lookup" \
    +         --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    +         --output text)
    +     read -r sandbox_access_key_id sandbox_secret_access_key sandbox_session_token <<EOF
    + $assume_role_output
    + EOF
    +     latest_version=$(
    +         AWS_ACCESS_KEY_ID="$sandbox_access_key_id" \
    +         AWS_SECRET_ACCESS_KEY="$sandbox_secret_access_key" \
    +         AWS_SESSION_TOKEN="$sandbox_session_token" \
    +         aws lambda list-layer-versions \
    +             --region us-east-1 \
    +             --layer-name "$SANDBOX_LAYER_ARN" \
    +             --query 'LayerVersions[0].Version || `0`' \
    +             --output text
    +     )
    + else
    +     latest_version=$(aws lambda list-layer-versions \
    +         --region us-east-1 \
    +         --layer-name "$SANDBOX_LAYER_ARN" \
    +         --query 'LayerVersions[0].Version || `0`' \
    +         --output text)
    + fi
    + if ! [[ "$latest_version" =~ ^[0-9]+$ ]]; then
    +     printf "Failed to resolve a numeric sandbox layer version for %s. Configure SANDBOX_LAYER_LOOKUP_ROLE_ARN with a role that can call lambda:ListLayerVersions on the sandbox layer, or grant the current role that permission.\n" "$SANDBOX_LAYER_ARN" >&2
    +     exit 1
    + fi
    + VERSION=$((latest_version + 1))

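
The numeric guard at the end of the suggestion can be exercised without any AWS access. The sketch below isolates it in a hypothetical `next_version` helper (not part of the PR) and uses a POSIX `case` pattern in place of the bash regex test, so it also runs under plain `sh`.

```shell
#!/bin/sh
# Sketch only: next_version is a hypothetical helper, not code from the PR.

# Print latest+1, or fail if the lookup result is empty or non-numeric
# (for example, after a failed list-layer-versions call).
next_version() {
    latest="$1"
    case "$latest" in
        ''|*[!0-9]*)
            printf 'Failed to resolve a numeric layer version: "%s"\n' "$latest" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$((latest + 1))"
}

next_version 0
```

A lookup that falls back to `0` (no versions published yet) therefore yields tag `1`, while an empty or textual result stops the job instead of producing a malformed tag.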
Aligns with the role created in cloud-inventory#59058, which mirrors the existing 'sandbox-layer-deployer' / 'dd-serverless-layer-deployer-role' naming convention.
The 'publish layer [self-monitoring]' job now publishes Datadog-Extension layers to the serverless-testing account (093468662994) instead of sandbox (425362996713), matching where 'publish private images' pushes container images. The build_private_image.sh layer-version lookup is now local (same account as the push) — no more cross-account read. This makes self-monitoring artifacts (test layers + container images) self-contained in 093468662994. Other 'publish layer' jobs (sandbox, prod) are unchanged.
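
With the layer and the image living in the same account, the lookup no longer needs a cross-account ARN or a second set of credentials. A minimal sketch, assuming the job's current credentials belong to the account that owns the layer; the `latest_layer_version` helper is hypothetical and the real script may be structured differently.

```shell
#!/bin/sh
# Sketch only: latest_layer_version is a hypothetical helper.
# Assumes the caller's credentials are for the account that owns the layer,
# so a bare layer name is enough for --layer-name.

# Print the newest version number of a layer, or 0 if none exist yet.
latest_layer_version() {
    layer_name="$1"
    region="${2:-us-east-1}"
    aws lambda list-layer-versions \
        --region "$region" \
        --layer-name "$layer_name" \
        --query 'LayerVersions[0].Version || `0`' \
        --output text
}
```

A caller could then compute the next tag with `VERSION=$(( $(latest_layer_version "$LAYER_NAME") + 1 ))`.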
Overview

Move both self-monitoring layers and container images from the sandbox account (`425362996713`) to the serverless-testing account (`093468662994`), where the LOD/LMI self-monitoring runtimes live. This eliminates cross-account ECR pulls during CDK Docker builds and the cross-account Lambda layer query in the image build script.

After this PR, the self-monitoring test artifacts are entirely self-contained in `093468662994`. The regular `publish layer sandbox` and `publish layer prod` jobs are untouched.

Changes

- `environments.yaml`: add `serverless_testing` environment for `093468662994` (assumes role `layer-deployer`, externalId `serverless-testing-publish-externalid`, mirrors `automatically_bump_version: 1` / `add_layer_version_permissions: 0` from sandbox)
- `pipeline.yaml.tpl`: two job changes:
  - `publish private images`: switch from `sandbox` env → `serverless_testing` env (push to new ECR)
  - `publish layer [self-monitoring]`: switch from `sandbox` env → `serverless_testing` env (publish Datadog-Extension layer to `093468662994` in us-east-1 + us-west-2)
- `build_private_image.sh`:
  - Push to `093468662994.dkr.ecr.us-east-1.amazonaws.com/datadog-lambda-extension` (parameterizable via `PRIVATE_IMAGE_ECR_ACCOUNT` / `PRIVATE_IMAGE_ECR_REPO`)
  - Drop the cross-account `arn:aws:lambda:us-east-1:425362996713:layer:…` lookup. Query the same account we publish to; this works because `publish layer [self-monitoring]` now lives in that account too.

Prerequisites

- ECR repository `datadog-lambda-extension` in `093468662994`: created by `serverless-self-monitoring#637` (LVU CDK), already deployed manually
- IAM role `layer-deployer` in `093468662994` with Lambda layer publish + ECR push perms: created by `cloud-inventory#59058`, already merged
- Secret `serverless-testing-publish-externalid` at `kv/k8s/gitlab-runner/datadog-lambda-extension/secrets`: created manually

Knock-on for serverless-self-monitoring

Layer-version-updater (`latest-dev.json`) currently pins `Datadog-Extension` to `arn:aws:lambda:us-east-1:425362996713:layer:Datadog-Extension:…`. After this PR's first run, the next "self-monitoring" extension layer is published to `093468662994` instead, so LVU will need to learn to query `093468662994` for the Datadog-Extension dev layer. Tracked as a follow-up; safe because the existing `425362996713` layers don't disappear, they just stop receiving new versions from the `[self-monitoring]` job.

Testing

- Confirm the `serverless_testing` environment is used for both `publish private images` and `publish layer [self-monitoring]`
- Run `publish layer [self-monitoring]` on a test pipeline → confirm Datadog-Extension layer published in `093468662994` (us-east-1, us-west-2)
- Run `publish private images` on the same pipeline → confirm image pushed to `093468662994/datadog-lambda-extension:<VERSION>` with version matching the layer just published
- Confirm images are pulled from `093468662994` ECR during CDK deploy
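
The layer and image checks in the testing list could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the layer name defaults to `Datadog-Extension` (the self-monitoring job may use a different name), and the caller is assumed to hold credentials that can read both Lambda layers and ECR in `093468662994`.

```shell
#!/bin/sh
# Sketch only: function names and the LAYER_NAME default are assumptions,
# not taken from the PR.
ACCOUNT="093468662994"
REPO="datadog-lambda-extension"
LAYER_NAME="${LAYER_NAME:-Datadog-Extension}"

# Print the newest published version of the layer in one region.
layer_version_in() {
    region="$1"
    aws lambda list-layer-versions \
        --region "$region" \
        --layer-name "arn:aws:lambda:${region}:${ACCOUNT}:layer:${LAYER_NAME}" \
        --query 'LayerVersions[0].Version' \
        --output text
}

# Succeed only if an image tagged with the us-east-1 layer version exists.
image_matches_layer() {
    version=$(layer_version_in us-east-1) || return 1
    aws ecr describe-images \
        --region us-east-1 \
        --registry-id "$ACCOUNT" \
        --repository-name "$REPO" \
        --image-ids "imageTag=${version}" \
        --query 'imageDetails[0].imageTags' \
        --output text >/dev/null
}
```

Running `layer_version_in us-east-1`, `layer_version_in us-west-2`, and then `image_matches_layer` after a test pipeline would cover the layer and image items; the CDK deploy check still needs to be observed in the self-monitoring stack itself.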