ci: Optimize docker layer for caching and add remote cache #1437
Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
client-id: ${{ inputs.AZURE_CLIENT_ID }}
tenant-id: ${{ inputs.AZURE_TENANT_ID }}
subscription-id: ${{ inputs.AZURE_SUBSCRIPTION_ID }}
These input references use uppercase names with underscores. If the action declares its inputs in lowercase with hyphens, as the suggestion assumes, the references will not resolve and will evaluate to empty strings.
Suggested change:
- client-id: ${{ inputs.AZURE_CLIENT_ID }}
- tenant-id: ${{ inputs.AZURE_TENANT_ID }}
- subscription-id: ${{ inputs.AZURE_SUBSCRIPTION_ID }}
+ client-id: ${{ inputs.azure-client-id }}
+ tenant-id: ${{ inputs.azure-tenant-id }}
+ subscription-id: ${{ inputs.azure-subscription-id }}
${{ env.container-registry }}/${{ env.REPO_LOWER }}:${{ fromJSON(steps.get-pr-info.outputs.pr-info || '{}').number || 0 }}
${{ env.container-registry }}/${{ env.REPO_LOWER }}:${{ github.sha }}
secrets: |
  GH_TOKEN=${{ secrets.PAT }}
A composite action cannot read the secrets context, so this should reference the input parameter instead: change secrets.PAT to inputs.PAT.
- name: Checkout
  uses: actions/checkout@v4
  with:
The empty with block serves no purpose and may cause the checkout step to fail or behave unexpectedly. Remove it.
Suggested change:
- - name: Checkout
-   uses: actions/checkout@v4
-   with:
+ - name: Checkout
+   uses: actions/checkout@v4
}
}
}' | jq -r '.data.repository.pullRequests.nodes[].number' | while read -r number; do
  echo "type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:$number-buildcache,mode=max"
${{ env.REPO_LOWER }} is empty here: it is set in the step "Normalize repo name to lowercase", which runs after this step. Use ${{ inputs.repo-name }} and lowercase it in the same script instead.
Suggested change:
- echo "type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:$number-buildcache,mode=max"
+ echo "type=registry,ref=${{ env.container-registry }}/$(echo ${{ inputs.repo-name }} | tr '[:upper:]' '[:lower:]'):$number-buildcache,mode=max"
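The inline lowercasing suggested above can be tried outside the workflow. The following is a minimal sketch, not the action's actual script: the helper names lower_repo and cache_from_entries, the registry host, and the repository name are all hypothetical placeholders, and the PR numbers are fed in by hand instead of coming from the GraphQL query.

```shell
# Hypothetical helper: lowercase a repository name for use in an image reference.
lower_repo() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: emit one cache-from entry per PR number read from stdin.
# $1 = registry host, $2 = repository name (any case)
cache_from_entries() {
  registry="$1"
  repo="$(lower_repo "$2")"
  while read -r number; do
    # Skip blank lines so no entry is produced without a PR number.
    [ -n "$number" ] || continue
    echo "type=registry,ref=${registry}/${repo}:${number}-buildcache,mode=max"
  done
}

# Example with placeholder values standing in for the jq output.
printf '%s\n' 101 102 | cache_from_entries "registry.example.com" "Org/Repo"
```

Because the lowercasing happens inside the same script, the entries do not depend on any variable set by another step.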
apt-get update
apt-get install -y gh
On GitHub-hosted runners, steps run as a non-root user, so apt-get needs sudo.
Suggested change:
- apt-get update
- apt-get install -y gh
+ sudo apt-get update
+ sudo apt-get install -y gh
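If the same script may run both as root (for example inside a container) and as a regular user, one option is to add sudo only when it is needed. This is a sketch under that assumption; the helper names sudo_prefix and install_commands are hypothetical, and the commands are printed rather than executed so the snippet is safe to run anywhere.

```shell
# Hypothetical helper: print "sudo" when the current user is not root, nothing otherwise.
sudo_prefix() {
  if [ "$(id -u)" -eq 0 ]; then
    printf ''
  else
    printf 'sudo'
  fi
}

# Hypothetical helper: print (not run) the install commands for the package in $1.
install_commands() {
  prefix="$(sudo_prefix)"
  # ${prefix:+$prefix } expands to "sudo " only when prefix is non-empty.
  echo "${prefix:+$prefix }apt-get update"
  echo "${prefix:+$prefix }apt-get install -y $1"
}

install_commands gh
```

In the workflow itself, writing sudo explicitly as suggested above is simpler when the runner is known to be non-root.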
apt-get update
apt-get install -y gh
sudo is missing, which is inconsistent with the Azure CLI installation at line 66 that uses sudo bash. The step will fail if the runner does not run as root.
Suggested change:
- apt-get update
- apt-get install -y gh
+ sudo apt-get update
+ sudo apt-get install -y gh
apt-get update
apt-get install -y uuid-runtime
sudo is missing, which is inconsistent with the Azure CLI installation at line 76 that uses sudo bash. The step will fail if the runner does not run as root.
Suggested change:
- apt-get update
- apt-get install -y uuid-runtime
+ sudo apt-get update
+ sudo apt-get install -y uuid-runtime
apt-get update
apt-get install -y gh
apt-get requires sudo here, and omitting it is inconsistent with the Azure CLI installation, which uses sudo bash.
Suggested change:
- apt-get update
- apt-get install -y gh
+ sudo apt-get update
+ sudo apt-get install -y gh
apt-get update
apt-get install -y uuid-runtime
apt-get requires sudo here, and omitting it is inconsistent with the Azure CLI installation, which uses sudo bash.
Suggested change:
- apt-get update
- apt-get install -y uuid-runtime
+ sudo apt-get update
+ sudo apt-get install -y uuid-runtime
}
}
}' | jq -r '.data.repository.pullRequests.nodes[].number' | while read -r number; do
  echo "type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:$number-buildcache,mode=max"
${{ env.REPO_LOWER }} is empty here: the step "Normalize repo name to lowercase" runs after this step (line 87). The variable is set at line 92 but is not available until subsequent steps. This will produce malformed cache references like nemoci.azurecr.io/:123-buildcache.
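The ordering matters because a value appended to the file named by $GITHUB_ENV only becomes visible to steps that run after the writing step. The source does not show how the normalize step sets REPO_LOWER, so the following is a sketch that assumes it uses $GITHUB_ENV; the repository name is a placeholder, and a temporary file stands in for $GITHUB_ENV when the snippet runs outside GitHub Actions.

```shell
# Fall back to a temporary file so the sketch also runs outside GitHub Actions.
GITHUB_ENV="${GITHUB_ENV:-$(mktemp)}"

# Placeholder repository name; in the action this would come from an input.
REPO="Org/Repo"
REPO_LOWER="$(printf '%s' "$REPO" | tr '[:upper:]' '[:lower:]')"

# Append NAME=value; the runner reads this file once the step has finished,
# so only later steps see REPO_LOWER in their environment.
echo "REPO_LOWER=${REPO_LOWER}" >> "$GITHUB_ENV"
```

Any step that needs REPO_LOWER therefore has to be placed after this one, or compute the lowercase name itself.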
type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:${{ fromJSON(steps.get-pr-info.outputs.pr-info || '{}').number || 0 }}-buildcache,mode=max
type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:main-buildcache,mode=max
${{ steps.cache_from.outputs.LAST_PRS }}
${{ env.REPO_LOWER }} may be empty here. It is set in the step at lines 87-92, which runs before this one, but the value reaches later steps only if that step writes it to $GITHUB_ENV; a plain shell assignment or export does not persist across steps. If the value is missing, the image references will be malformed.
- name: Checkout repository
  uses: actions/checkout@v2
  with:
    path: NeMo-Curator
The checkout path NeMo-Curator does not match the repository name Curator.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    --mount=type=cache,target=/root/.cache/uv \
    uv sync --link-mode copy --locked --extra all --all-groups --no-cache && \
The --no-cache flag negates the --mount=type=cache,target=/root/.cache/uv above it: uv neither reads from nor writes to its cache during the sync, so the cache mount goes unused.
apt-get update
apt-get install -y uuid-runtime
On GitHub-hosted runners, steps run as a non-root user, so apt-get needs sudo.
Suggested change:
- apt-get update
- apt-get install -y uuid-runtime
+ sudo apt-get update
+ sudo apt-get install -y uuid-runtime
}' | jq -r '.data.repository.pullRequests.nodes[].number' | while read -r number; do
  echo "type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:$number-buildcache,mode=max"
${{ env.REPO_LOWER }} is empty here: it is set in the next step at line 92, and environment variables are only available in steps that run after the one that sets them. This produces malformed cache references like nemoci.azurecr.io/:123-buildcache.
Use inline bash lowercasing instead (REPO is assumed to be a shell variable holding the repository name):
echo "type=registry,ref=${{ env.container-registry }}/${REPO,,}:$number-buildcache,mode=max"
type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:${{ fromJSON(steps.get-pr-info.outputs.pr-info || '{}').number || 0 }}-buildcache,mode=max
type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:main-buildcache,mode=max
${{ steps.cache_from.outputs.LAST_PRS }}
cache-to: |
  type=registry,ref=${{ env.container-registry }}/${{ env.REPO_LOWER }}:${{ fromJSON(steps.get-pr-info.outputs.pr-info || '{}').number || 0 }}-buildcache,mode=max
no-cache: false
tags: |
  ${{ env.container-registry }}/${{ env.REPO_LOWER }}:${{ fromJSON(steps.get-pr-info.outputs.pr-info || '{}').number || 0 }}
  ${{ env.container-registry }}/${{ env.REPO_LOWER }}:${{ github.sha }}
${{ env.REPO_LOWER }} may be empty at lines 130-131. It is set at line 92, but it reaches later steps only if that step persists it, for example through $GITHUB_ENV. If it is missing, the cache and tag references will be malformed, like nemoci.azurecr.io/:abc123.
A more explicit alternative is to set REPO_LOWER as an output of the normalize step and reference ${{ steps.normalize.outputs.repo_lower }} instead.
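Exposing the lowercase name as a step output means appending a name=value line to the file named by $GITHUB_OUTPUT. The sketch below assumes the normalize step is given the id normalize, which is what the reference ${{ steps.normalize.outputs.repo_lower }} requires; the repository name is a placeholder, and a temporary file stands in for $GITHUB_OUTPUT outside GitHub Actions.

```shell
# Fall back to a temporary file so the sketch also runs outside GitHub Actions.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# Placeholder repository name; in the action this would come from an input.
REPO="Org/Repo"

# The output key repo_lower matches the key used in the suggested expression.
echo "repo_lower=$(printf '%s' "$REPO" | tr '[:upper:]' '[:lower:]')" >> "$GITHUB_OUTPUT"
```

An output makes the dependency between the steps explicit, since every consumer names the step it reads from.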
- name: Checkout repository
  uses: actions/checkout@v2
  with:
    path: NeMo-Curator
The checkout path NeMo-Curator does not match the repository name curator.
Suggested change:
- path: NeMo-Curator
+ path: curator
@@ -116,7 +119,7 @@ runs:
--volume $(pwd)/NeMo-Curator:/workspace \
The volume mount path references NeMo-Curator but should be curator to match the repository name.
Suggested change:
- --volume $(pwd)/NeMo-Curator:/workspace \
+ --volume $(pwd)/curator:/workspace \
# Install Curator
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    uv sync --link-mode copy --locked --extra all --all-groups --no-cache && \
The --no-cache flag makes uv skip its own cache for this sync, so the install gets no benefit from caching. The apt cache mounts above are not affected by the flag, but uv sync does not use them. Drop --no-cache, and consider mounting a cache at uv's cache directory, such as the target=/root/.cache/uv mount shown in the earlier snippet.
Suggested change:
- uv sync --link-mode copy --locked --extra all --all-groups --no-cache && \
+ uv sync --link-mode copy --locked --extra all --all-groups && \
Description
Improve Docker layer caching and add a container build stage that pushes to and pulls from a remote cache.
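For readers who want to reproduce the remote cache outside the workflow, the cache-from and cache-to entries quoted in the review correspond to the --cache-from and --cache-to options of docker buildx build. The following is a sketch only: the helper name build_command, the registry host, and the repository name are hypothetical, and the command is printed rather than executed so that nothing is built or pushed.

```shell
# Hypothetical helper: print a buildx command that pulls from and pushes to a registry cache.
# $1 = registry host, $2 = lowercase repository name, $3 = tag (for example a PR number)
build_command() {
  registry="$1"
  repo="$2"
  tag="$3"
  ref="${registry}/${repo}:${tag}"
  # mode=max exports cache for all intermediate layers, not only the final image.
  echo docker buildx build \
    --cache-from "type=registry,ref=${ref}-buildcache" \
    --cache-to "type=registry,ref=${ref}-buildcache,mode=max" \
    --tag "$ref" \
    --push .
}

build_command "registry.example.com" "org/repo" "123"
```

The -buildcache suffix follows the naming used in the reviewed snippets, which keeps the cache manifest separate from the image tag.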
Usage
# Add snippet demonstrating usage