This repository contains Docker configurations for various development tasks.
It is organized as a set of Docker configuration directories:

```
dockers/<docker-dir>/
```
Each directory contains relevant Dockerfiles, Docker Compose files, and related scripts for a specific environment or tool.
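For orientation, the layout looks roughly like this (illustrative; `infer-dev` is the directory documented below):

```
dockers/
├── infer-dev/      # PeiDocker-based CUDA dev container
└── <docker-dir>/   # one directory per environment or tool
```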
`dockers/infer-dev` is a PeiDocker-based CUDA dev container that can auto-launch llama-server on startup. To (re)build it:

```bash
# Keep durable edits in user_config.persist.yml, then copy it to user_config.yml
cp dockers/infer-dev/src/user_config.persist.yml dockers/infer-dev/src/user_config.yml

# Regenerate the artifacts under dockers/infer-dev/src/
(cd dockers/infer-dev && ./pei-configure.sh --with-merged)

# Build both stages
docker compose -f dockers/infer-dev/src/docker-compose.yml build stage-1
docker compose -f dockers/infer-dev/src/docker-compose.yml build stage-2
```

The entry hook can auto-start llama-server instances, but it is off by default:
- Set `AUTO_INFER_LLAMA_CPP_ON_BOOT=1` (or `true`) to enable auto-start on boot.
- Set `AUTO_INFER_LLAMA_CPP_CONFIG` to point at a TOML file with instance definitions.
- If auto-start is disabled, you can run `/soft/app/llama-cpp/check-and-run-llama-cpp.sh` manually inside the container (see the sketch after this list).
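For example, to launch by hand inside a running container (a sketch: it assumes the check script reads `AUTO_INFER_LLAMA_CPP_CONFIG` from the environment and that the config file is already mounted, as in the run command further below):

```bash
# Assumption: check-and-run-llama-cpp.sh honors AUTO_INFER_LLAMA_CPP_CONFIG
# when invoked manually; /model-configs is mounted as in the run command below.
export AUTO_INFER_LLAMA_CPP_CONFIG=/model-configs/glm-4.7-q2k.toml
/soft/app/llama-cpp/check-and-run-llama-cpp.sh
```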
Example (GLM-4.7 Q2_K):
- Config: `dockers/infer-dev/model-configs/glm-4.7-q2k.toml`
- Host port `11980` → container port `8080` (see `dockers/infer-dev/src/docker-compose.yml`)
Run with the environment variables set (publishing service ports and mounting the config directory into the container):

```bash
docker compose -f dockers/infer-dev/src/docker-compose.yml run -d --service-ports --name infer-glm \
  -v "$PWD/dockers/infer-dev/model-configs:/model-configs:ro" \
  -e AUTO_INFER_LLAMA_CPP_ON_BOOT=1 \
  -e AUTO_INFER_LLAMA_CPP_CONFIG=/model-configs/glm-4.7-q2k.toml \
  stage-2 sleep infinity
```

Verify:

```bash
curl http://127.0.0.1:11980/v1/models
curl http://127.0.0.1:11980/v1/chat/completions -H 'Content-Type: application/json' -d '{
"model": "glm4",
"messages": [{"role": "user", "content": "Hello"}],
"max_tokens": 64
}'
```
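If `jq` is available on the host, the model listing is easier to scan (this assumes the endpoint returns the OpenAI-style `{"data": [...]}` shape, which llama-server uses):

```bash
# Optional: print just the model ids from the OpenAI-compatible listing.
curl -s http://127.0.0.1:11980/v1/models | jq '.data[].id'
```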
Notes:

- The sample config mounts a specific model directory to `/llm-models/...` (not the entire host model tree); adjust `dockers/infer-dev/src/user_config.persist.yml` and rerun `./dockers/infer-dev/pei-configure.sh` to test other models.
- Setting `AUTO_INFER_LLAMA_CPP_PKG_PATH` together with `AUTO_INFER_LLAMA_CPP_GET_PKG_ON_BOOT=1` (or `true`) installs a prebuilt llama.cpp bundle into `/soft/app/llama-cpp` on boot (the archive is cached under `/soft/app/cache`). If auto-install is off, run `/soft/app/llama-cpp/get-llama-cpp-pkg.sh` inside the container instead.
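A minimal sketch of the auto-install path, assuming the bundle archive sits at a hypothetical location `/soft/app/cache/llama-cpp-pkg.tar.gz` (substitute wherever you actually stage the package):

```bash
# Sketch only: the PKG_PATH value below is a placeholder,
# not a path shipped with this repo.
docker compose -f dockers/infer-dev/src/docker-compose.yml run -d --service-ports \
  -e AUTO_INFER_LLAMA_CPP_GET_PKG_ON_BOOT=1 \
  -e AUTO_INFER_LLAMA_CPP_PKG_PATH=/soft/app/cache/llama-cpp-pkg.tar.gz \
  stage-2 sleep infinity
```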