remove reliance on rapidsai/miniforge-cuda (and therefore ci-imgs), upgrade cuvs-bench-cpu to miniforge 25.9.1-0#836
Conversation
3972d33 to
48075ba
Compare
| ARG RAPIDS_VER=26.02 | ||
|
|
||
| FROM condaforge/miniforge3:24.11.3-2 AS cuvs-bench-cpu | ||
| FROM condaforge/miniforge3:${MINIFORGE_VER} AS cuvs-bench-cpu |
There was a problem hiding this comment.
Calling out that this is technically a breaking change, because this updates the condaforge/miniforge3 used as a base for rapidsai/cuvs-bench-cpu from 24.11.3-2 to 25.9.1-0.
I think that's acceptable, especially because the builds do a rapids-mamba-retry update --all -y -n base a few lines down, but just calling it out.
I tested this using the command below, with a CPU-only algorithm and the smallest dataset listed at https://github.com/rapidsai/cuvs/blob/bae4cdbd0003c1572c0043541ff9826a2628762a/docs/source/cuvs_bench/index.rst
IMAGE_URI="docker.io/rapidsai/staging:docker-cuvs-bench-cpu-836-26.02a-py3.10-amd64@sha256:9026429656d09df7e2f59e75d6c6c4db0a112251378cce7577c45e8547404346"
docker run --rm -it \
"${IMAGE_URI}" \
"--dataset fashion-mnist-784-euclidean" \
"--normalize" \
"--algorithms hnswlib --batch-size 10 -k 10" \
""And it seemed to run ok.
| ARG RAPIDS_VER=26.02 | ||
|
|
||
| FROM rapidsai/miniforge-cuda:${RAPIDS_VER}-cuda${CUDA_VER}-base-${LINUX_VER}-py${PYTHON_VER} AS cuvs-bench | ||
| # --- begin 'rapidsai/miniforge-cuda' --- # |
There was a problem hiding this comment.
This is a lot of boilerplate to include twice. We're sure this is better than having a common miniforge-cuda image and using that?
There was a problem hiding this comment.
I really really think it is worth it.
Being able to focus https://github.com/rapidsai/ci-imgs only on RAPIDS CI and not need to think about how changes there will affect users on Brev, Databricks, Sagemaker, etc. is well worth the separation duplication.
Also these images get published to NGC and go through more rigorous compliance and security scanning than the CI images, so it's helpful to know we can make changes in https://github.com/rapidsai/ci-imgs that won't endanger our ability to release these on-time.
It'll be a lot less boilerplate in a follow-up PR where I remove the things that are unnecessary in this context (for example, we don't build RockyLinux images here so anything about that OS can be cut out). I intentionally made this PR almost a straight copy-paste of the Dockerfiles so we could have high confidence this wasn't changing the resulting images too much.
I think there's also significant opportunity to move some of the identical code into scripts in shared context, instead of having them repeated across the Dockerfiles.
There was a problem hiding this comment.
Great! Thanks for the context.
|
/merge |
…ications (#839) Contributes to #832 This has a collection of small updates, mostly simplifications as a follow-up to #836 ## Adds docs on building locally Expands the docs on building locally so they can be copy-pasteable. I tested all of these on an amd64 Linux machine and they worked. Hopefully this will help with testing changes. ## Removes anything about Rocky Linux Some Rocky Linux details were copied over from https://github.com/rapidsai/ci-imgs in #836. We only build Ubuntu images in this repo, so this removes those to simplify things a bit. ## Combines some `RUN` steps I saw a few opportunities to combine `RUN` steps that I think are non-controversial. e.g. combining these should not affect caching: https://github.com/rapidsai/ci-imgs/blob/0143b2bde68d3fc35648df0cd56415a657ae7590/ci-conda.Dockerfile#L144-L152 Each `RUN` step creates a layer, and there's some fixed per-layer overhead for building, pushing, and pulling. This should make builds a little faster, images a little smaller, and image pulls a bit more reliable. ## Adds more `container-canary` checks A few environment variables in these images are part of the public API (for example, `dask-labextension` configuration). This adds `container-canary` checks on them to ensure they aren't accidentally broken by future refactorings. ## Other Changes See inline comments. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #839
Closes #832 Follow-up to #836 (comment) In #836, this repo's dependency on https://github.com/rapidsai/ci-imgs was removed by replacing `FROM rapidsai/miniforge-cuda` with the inline contents of the `Dockerfile` used to produce that image. That introduced a lot of duplication between the Dockerfiles there. This proposes reducing much of that duplication by moving shared code into scripts that are mounted in at build time. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #840
Contributes to #832
Removes reliance on
rapidsai/miniforge-cudahere, and therefore fully decouples this project from https://github.com/rapidsai/ci-imgs.Notes for Reviewers
I've done the following here:
FROM rapidsai/miniforge-cudawith the relevant contents of https://github.com/rapidsai/ci-imgs/blob/main/ci-conda.Dockerfilesccache, to avoid needing to pass around a GitHub tokenci-conda.DockerfileunchangedThere are things in the diff that look unnecessary (like handling the possibility of rockylinux8 when we only build Ubuntu images here)... let's defer those cleanups to follow-up PRs after this.