Move self-monitoring layers and private images to serverless-testing (093468662994) #1183

Draft
duncanista wants to merge 3 commits into main from jordan.gonzalez/publish-private-images-to-testing-account

@duncanista duncanista commented Apr 13, 2026

Overview

Move both self-monitoring layers and container images from the sandbox account (425362996713) to the serverless-testing account (093468662994), where the LOD/LMI self-monitoring runtimes live. This eliminates cross-account ECR pulls during CDK Docker builds and the cross-account Lambda layer query in the image build script.

After this PR, the self-monitoring test artifacts are entirely self-contained in 093468662994. The regular publish layer sandbox and publish layer prod jobs are untouched.

Changes

  • environments.yaml: add serverless_testing environment for 093468662994 (assumes role layer-deployer, externalId serverless-testing-publish-externalid, mirrors automatically_bump_version: 1 / add_layer_version_permissions: 0 from sandbox)
  • pipeline.yaml.tpl — two job changes:
    • publish private images: switch from sandbox env → serverless_testing env (push to new ECR)
    • publish layer [self-monitoring]: switch from sandbox env → serverless_testing env (publish Datadog-Extension layer to 093468662994 in us-east-1 + us-west-2)
  • build_private_image.sh:
    • Push to 093468662994.dkr.ecr.us-east-1.amazonaws.com/datadog-lambda-extension (parameterizable via PRIVATE_IMAGE_ECR_ACCOUNT / PRIVATE_IMAGE_ECR_REPO)
    • Drop the cross-account arn:aws:lambda:us-east-1:425362996713:layer:… lookup. Query the same account we publish to — works because publish layer [self-monitoring] now lives in that account too.
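The build_private_image.sh changes listed above can be sketched roughly as follows. This is a minimal illustration, not the script itself: the two PRIVATE_IMAGE_ECR_* variables, the default account, the repository name, and the JMESPath query come from this PR; the helper function names and the LAYER_NAME default are assumptions made for the example.

```shell
#!/bin/sh
set -eu

# Defaults taken from the PR description; both can be overridden from the environment.
PRIVATE_IMAGE_ECR_ACCOUNT="${PRIVATE_IMAGE_ECR_ACCOUNT:-093468662994}"
PRIVATE_IMAGE_ECR_REPO="${PRIVATE_IMAGE_ECR_REPO:-datadog-lambda-extension}"
REGION="us-east-1"
# Assumption: the real script may define the layer name differently.
LAYER_NAME="${LAYER_NAME:-Datadog-Extension}"

# Hypothetical helper: full ECR image URI for a given tag ($1).
image_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' \
    "$PRIVATE_IMAGE_ECR_ACCOUNT" "$REGION" "$PRIVATE_IMAGE_ECR_REPO" "$1"
}

# Hypothetical helper: unversioned layer ARN in the same account the image is pushed to.
layer_arn() {
  printf 'arn:aws:lambda:%s:%s:layer:%s\n' \
    "$REGION" "$PRIVATE_IMAGE_ECR_ACCOUNT" "$LAYER_NAME"
}

# Hypothetical helper: first listed layer version, or 0 when none exist.
# Uses the credentials already assumed by the job, so no cross-account read.
latest_layer_version() {
  aws lambda list-layer-versions \
    --region "$REGION" \
    --layer-name "$(layer_arn)" \
    --query 'LayerVersions[0].Version || `0`' \
    --output text
}
```

Because the layer ARN is derived from the same account variable as the push target, overriding PRIVATE_IMAGE_ECR_ACCOUNT moves both the lookup and the push together. The snippet only defines functions; the script quoted in the review thread derives VERSION from the result of this lookup.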

Prerequisites

  • ECR repo datadog-lambda-extension in 093468662994 — created by serverless-self-monitoring#637 (LVU CDK), already deployed manually
  • IAM role layer-deployer in 093468662994 with Lambda layer publish + ECR push perms — created by cloud-inventory#59058, already merged
  • Vault key serverless-testing-publish-externalid at kv/k8s/gitlab-runner/datadog-lambda-extension/secrets — created manually

Knock-on for serverless-self-monitoring

Layer-version-updater (latest-dev.json) currently pins Datadog-Extension to arn:aws:lambda:us-east-1:425362996713:layer:Datadog-Extension:…. After this PR's first run, the next "self-monitoring" extension layer is published to 093468662994 instead — LVU will need to learn to query 093468662994 for the Datadog-Extension dev layer. Tracked as a follow-up; safe because the existing 425362996713 layers don't disappear, they just stop receiving new versions from the [self-monitoring] job.
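One way the follow-up lookup could work is sketched below. How LVU actually resolves layers is not shown in this PR, so the function names and the DEV_LAYER_ACCOUNT variable are hypothetical; only the account ID, the layer name, and the `aws lambda list-layer-versions` call are taken from this PR or the AWS CLI. The sketch also assumes, as the build script does, that the first listed entry is the newest version.

```shell
#!/bin/sh
set -eu

# Account that receives new [self-monitoring] layer versions after this PR.
DEV_LAYER_ACCOUNT="${DEV_LAYER_ACCOUNT:-093468662994}"

# Hypothetical helper: unversioned ARN of the dev layer in a region ($1).
dev_layer_arn() {
  printf 'arn:aws:lambda:%s:%s:layer:Datadog-Extension\n' "$1" "$DEV_LAYER_ACCOUNT"
}

# Hypothetical helper: full ARN of the first listed version in a region ($1).
# With --output text the CLI prints "None" when the layer has no versions.
latest_dev_layer_version_arn() {
  aws lambda list-layer-versions \
    --region "$1" \
    --layer-name "$(dev_layer_arn "$1")" \
    --query 'LayerVersions[0].LayerVersionArn' \
    --output text
}
```

Layer version numbers are scoped to a layer in a single account and region, so a version pinned under 425362996713 cannot be carried over by swapping the account ID in the ARN. The version has to be resolved again in 093468662994.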

Testing

  • Generated pipeline YAML has serverless_testing environment for both publish private images and publish layer [self-monitoring]
  • Trigger manual publish layer [self-monitoring] on a test pipeline → confirm Datadog-Extension layer published in 093468662994 (us-east-1, us-west-2)
  • Trigger manual publish private images on the same pipeline → confirm image pushed to 093468662994/datadog-lambda-extension:<VERSION> with version matching the layer just published
  • Verify LOD/LMI can pull from 093468662994 ECR during CDK deploy
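The two manual publish checks above could be scripted along these lines. This is a sketch under assumptions: it expects credentials for 093468662994 to be active already, treats the us-east-1 layer version as the expected image tag (per the "version matching the layer just published" item), and uses function names invented for the example.

```shell
#!/bin/sh
set -eu

ACCOUNT="093468662994"
REPO="datadog-lambda-extension"
LAYER="Datadog-Extension"

# First listed layer version in a region ($1), or 0 if nothing is published.
latest_layer_version() {
  aws lambda list-layer-versions --region "$1" \
    --layer-name "arn:aws:lambda:$1:${ACCOUNT}:layer:${LAYER}" \
    --query 'LayerVersions[0].Version || `0`' --output text
}

# Succeeds only if an image with tag $1 exists in the ECR repository.
image_tag_exists() {
  aws ecr describe-images --region us-east-1 \
    --registry-id "$ACCOUNT" --repository-name "$REPO" \
    --image-ids "imageTag=$1" >/dev/null 2>&1
}

# Checks both regions for a layer and us-east-1 ECR for the matching image tag.
verify_publish() {
  east=$(latest_layer_version us-east-1)
  west=$(latest_layer_version us-west-2)
  if [ "$east" -eq 0 ] || [ "$west" -eq 0 ]; then
    echo "FAIL: layer missing (us-east-1=$east, us-west-2=$west)"
    return 1
  fi
  if ! image_tag_exists "$east"; then
    echo "FAIL: image tag $east not found in ${ACCOUNT}/${REPO}"
    return 1
  fi
  echo "OK: layer v$east (us-east-1), v$west (us-west-2), image tag $east"
}
```

Calling `verify_publish` after both manual jobs finish prints a single OK or FAIL line. The final checklist item, LOD/LMI pulling during CDK deploy, is not covered here and still needs to be verified in those stacks.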

Update build_private_image.sh to push to 093468662994
(serverless-testing) instead of 425362996713 (sandbox).

The self-monitoring container runtimes (LOD, LMI) run in
093468662994, so co-locating the extension images there
removes the need for cross-account ECR pulls during CDK
Docker builds.
Copilot AI left a comment

Pull request overview

Updates the CI “publish private images” workflow to publish Lambda extension container images into the serverless-testing AWS account (093468662994) to avoid cross-account ECR pulls during downstream builds.

Changes:

  • Switch the “publish private images” GitLab job to assume the serverless_testing environment role.
  • Update build_private_image.sh to target 093468662994.dkr.ecr.us-east-1.amazonaws.com/datadog-lambda-extension by default (configurable via env vars) and to look up the next tag by querying the sandbox layer ARN.
  • Add a new serverless_testing entry to the environments datasource.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| .gitlab/templates/pipeline.yaml.tpl | Makes the private image publish job assume the serverless-testing environment instead of sandbox. |
| .gitlab/scripts/build_private_image.sh | Changes ECR destination defaults and updates layer-version lookup logic for image tagging. |
| .gitlab/datasources/environments.yaml | Adds the serverless_testing environment definition (account/role/external-id). |

Comment thread .gitlab/datasources/environments.yaml
Comment thread .gitlab/scripts/build_private_image.sh Outdated
Comment on lines 33 to 34
latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
VERSION=$(($latest_version + 1))
Copilot AI Apr 13, 2026

This script now calls aws lambda list-layer-versions against a sandbox layer ARN (425362996713), but the CI job that runs it was updated to assume the serverless_testing role. Unless that role is granted lambda:ListLayerVersions on the sandbox layer resource, this call will fail with AccessDenied and the image publish job will exit (due to set -e).

Either ensure the lambda-extension-image-publisher role has cross-account permission to list versions for the sandbox layer ARN(s), or adjust the workflow so the layer version lookup is performed with sandbox credentials (e.g., assume the sandbox role just for this lookup).

Suggested change
latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
VERSION=$(($latest_version + 1))
SANDBOX_LAYER_ARN="arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}"
if [ -n "${SANDBOX_LAYER_LOOKUP_ROLE_ARN:-}" ]; then
    printf "Assuming sandbox role for layer version lookup in account %s...\n" "$SANDBOX_ACCOUNT"
    assume_role_output=$(aws sts assume-role \
        --region us-east-1 \
        --role-arn "$SANDBOX_LAYER_LOOKUP_ROLE_ARN" \
        --role-session-name "build-private-image-layer-lookup" \
        --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
        --output text)
    read -r sandbox_access_key_id sandbox_secret_access_key sandbox_session_token <<EOF
$assume_role_output
EOF
    latest_version=$(
        AWS_ACCESS_KEY_ID="$sandbox_access_key_id" \
        AWS_SECRET_ACCESS_KEY="$sandbox_secret_access_key" \
        AWS_SESSION_TOKEN="$sandbox_session_token" \
        aws lambda list-layer-versions \
            --region us-east-1 \
            --layer-name "$SANDBOX_LAYER_ARN" \
            --query 'LayerVersions[0].Version || `0`' \
            --output text
    )
else
    latest_version=$(aws lambda list-layer-versions \
        --region us-east-1 \
        --layer-name "$SANDBOX_LAYER_ARN" \
        --query 'LayerVersions[0].Version || `0`' \
        --output text)
fi
if ! [[ "$latest_version" =~ ^[0-9]+$ ]]; then
    printf "Failed to resolve a numeric sandbox layer version for %s. Configure SANDBOX_LAYER_LOOKUP_ROLE_ARN with a role that can call lambda:ListLayerVersions on the sandbox layer, or grant the current role that permission.\n" "$SANDBOX_LAYER_ARN" >&2
    exit 1
fi
VERSION=$((latest_version + 1))

@duncanista duncanista marked this pull request as draft April 14, 2026 16:36
duncanista added 2 commits May 6, 2026 15:54
Aligns with the role created in cloud-inventory#59058, which mirrors
the existing 'sandbox-layer-deployer' / 'dd-serverless-layer-deployer-role'
naming convention.
The 'publish layer [self-monitoring]' job now publishes Datadog-Extension
layers to the serverless-testing account (093468662994) instead of sandbox
(425362996713), matching where 'publish private images' pushes container
images. The build_private_image.sh layer-version lookup is now local
(same account as the push) — no more cross-account read.

This makes self-monitoring artifacts (test layers + container images)
self-contained in 093468662994. Other 'publish layer' jobs (sandbox, prod)
are unchanged.
@duncanista duncanista changed the title from "Publish private extension images to serverless-testing account (093468662994)" to "Move self-monitoring layers and private images to serverless-testing (093468662994)" May 8, 2026