This repository implements the LATTICE spec in LATTICE-SPEC.md as a dual-GPU,
PyTorch DDP project. Each rank owns one GPU and runs generation, labeling, replay,
and learner work locally. DDP synchronizes model gradients through NCCL.
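The one-rank-per-GPU arrangement described above can be sketched as follows. This is a minimal illustration, not the repository's runtime code: the helper names `read_rank_env` and `wrap_for_ddp` are hypothetical, and the sketch assumes the process is launched by `torchrun`, which exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` for each worker.

```python
import os


def read_rank_env(env=None):
    """Parse the rank variables that torchrun exports for each worker process."""
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", "0")),
        "local_rank": int(env.get("LOCAL_RANK", "0")),
        "world_size": int(env.get("WORLD_SIZE", "1")),
    }


def wrap_for_ddp(model, env=None):
    """Pin this process to its GPU and wrap `model` in DistributedDataParallel.

    Hypothetical helper: torch is imported lazily so that read_rank_env stays
    usable on machines without torch or CUDA installed.
    """
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    info = read_rank_env(env)
    # Each rank owns exactly one GPU, selected by its local rank.
    torch.cuda.set_device(info["local_rank"])
    if not dist.is_initialized():
        # NCCL carries the gradient all-reduce between the GPUs.
        dist.init_process_group(backend="nccl")
    device = torch.device("cuda", info["local_rank"])
    return DDP(model.to(device), device_ids=[info["local_rank"]])
```

Under this sketch, generation, labeling, and replay would stay in rank-local code, and only the wrapped model's gradients would cross the NCCL link during the backward pass.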
This project uses uv and keeps dependencies in .venv.

    uv venv --python 3.12
    uv sync --extra dev

Install the GPU extra only after the base environment works:

    uv sync --extra dev --extra gpu

Run the environment check:

    uv run lattice doctor

These commands are safe for this development PC. They do not start background services and the default alpha benchmark does not import torch or touch CUDA.

    uv run lattice alpha-bench --profile local --output runs/local-alpha-bench.json
    uv run lattice checkmate-alpha1 --no-require-vm-artifacts --no-require-openspec-complete

The readiness command above is a local contract check only. Full alpha readiness
requires the VM artifacts listed in docs/checkmate-alpha1.md.
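The "does not import torch" property of a CPU-only path can be checked by comparing `sys.modules` before and after the run. The sketch below shows the idea with a hypothetical stand-in, `run_cpu_benchmark`; it is not the repository's benchmark, and `newly_imported` is likewise an illustrative helper.

```python
import sys
import time


def run_cpu_benchmark(iterations=10_000):
    """Hypothetical stand-in for a CPU-only benchmark: pure-Python work, no torch."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    elapsed = time.perf_counter() - start
    return {"iterations": iterations, "checksum": total, "seconds": elapsed}


def newly_imported(before, forbidden=("torch",)):
    """Return forbidden modules that appeared in sys.modules since `before`."""
    added = set(sys.modules) - set(before)
    return sorted(name for name in added if name.split(".")[0] in forbidden)


before = set(sys.modules)        # snapshot of loaded modules before the run
result = run_cpu_benchmark()
leaked = newly_imported(before)  # an empty list means the run stayed torch-free
```

A check like this only detects imports that happen during the measured call, so it would need to run in a fresh interpreter to say anything about module-level imports.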
Do not run the commands below on a desktop used for gaming unless you intend to occupy the GPUs. Run them on the Thunder Compute VM or another dedicated CUDA machine.
The runtime smoke command validates the dual-GPU topology and exits. It does not start a long training job.

    uv run torchrun --nproc_per_node=2 -m lattice.cli ddp-smoke

Print the planned per-rank GPU buffer sizes without allocating them:

    uv run lattice buffer-plan

VM-only commands live in docs/VM_RUNBOOK.md. Do not run timed smoke or profiler
jobs on a desktop used for gaming unless you intend to occupy the GPUs.
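Planning buffer sizes without allocating them comes down to multiplying shapes by element widths. The sketch below illustrates that arithmetic only; the buffer names, shapes, and dtypes are hypothetical and are not taken from the LATTICE spec or from the actual `buffer-plan` output.

```python
from dataclasses import dataclass
from math import prod

# Element widths in bytes for a few common tensor dtypes.
DTYPE_BYTES = {
    "float32": 4, "float16": 2, "bfloat16": 2,
    "int64": 8, "int32": 4, "int16": 2, "int8": 1, "uint8": 1, "bool": 1,
}


@dataclass(frozen=True)
class BufferSpec:
    name: str
    shape: tuple
    dtype: str

    @property
    def nbytes(self):
        # Size is computed from metadata only; no memory is allocated.
        return prod(self.shape) * DTYPE_BYTES[self.dtype]


def plan_buffers(specs):
    """Return per-buffer rows and the total byte count for one rank."""
    rows = [(s.name, s.shape, s.dtype, s.nbytes) for s in specs]
    total = sum(row[3] for row in rows)
    return rows, total


# Hypothetical per-rank replay buffers, for illustration only.
specs = [
    BufferSpec("replay_states", (1_000_000, 64), "int8"),
    BufferSpec("replay_values", (1_000_000,), "float16"),
    BufferSpec("replay_moves", (1_000_000,), "int16"),
]
rows, total = plan_buffers(specs)
for name, shape, dtype, nbytes in rows:
    print(f"{name:<16} {str(shape):<18} {dtype:<8} {nbytes / 2**20:8.1f} MiB")
print(f"total per rank: {total / 2**20:.1f} MiB")
```

Because the plan is pure arithmetic, it can run on a machine with no GPU and be compared against the available VRAM before any job is launched.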
- DDP runtime skeleton and dual-GPU smoke command.
- LATTICE-base model and synthetic DDP learner.
- GPU chess state, legal move generation, move application, terminal checks, CPU differential tests, and throughput profiling.
- Rank-local VRAM replay and sparse negamax labeling.
- Integrated self-play training loop.
- Evaluation, profiling, and CUDA/Triton hot-path optimization.