12 changes: 9 additions & 3 deletions .claude/skills/ptq/references/slurm-setup-ptq.md
@@ -28,10 +28,16 @@ If enroot import fails (e.g., permission errors on lustre), use pyxis inline pul

### Container dependency pitfalls

**New models may need newer transformers** than what's in the container. Install from source inside the job script:
**New models may need newer transformers** than what's in the container. Install from PyPI inside the job script (unset `PIP_CONSTRAINT` first if needed — see below):

```bash
pip install git+https://github.com/huggingface/transformers.git --quiet
pip install -U transformers
```

Only install from git if the fix you need isn't in a released version yet:

```bash
pip install git+https://github.com/huggingface/transformers.git
```
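
A minimal sketch of how a job script might gate the upgrade on the installed version, so the container's packages are left alone when they are already new enough. The helper names `version_ge` and `ensure_transformers` and the minimum-version argument are hypothetical, not part of this skill; the sketch assumes GNU `sort -V` and a `python` on `PATH` inside the container.

```bash
# Sketch only: helper names and the minimum version are illustrative assumptions.

# Succeeds when version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Upgrade transformers from PyPI only if the installed version is older than $1.
ensure_transformers() {
    local min_version="$1" installed
    # Falls back to "0" when transformers cannot be imported.
    installed="$(python -c 'import transformers; print(transformers.__version__)' 2>/dev/null || echo 0)"
    if ! version_ge "$installed" "$min_version"; then
        unset PIP_CONSTRAINT   # only matters if the container sets it
        pip install -U transformers
    fi
}
```

In a job script this would be called once before the quantization step, for example `ensure_transformers "$MIN_TRANSFORMERS"`, where `MIN_TRANSFORMERS` is whatever release the new model needs.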

**Prefer `PYTHONPATH`** to use the synced ModelOpt source instead of installing inside the container — this avoids risking dependency conflicts (e.g., `pip install -U nvidia-modelopt[hf]` can upgrade PyTorch and break other packages):
@@ -90,4 +96,4 @@ This catches script errors cheaply before using GPU quota on a real run.

See `skills/common/slurm-setup.md` section 2 for the smoke test partition pattern.

Only submit the full calibration job after the smoke test exits cleanly.
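
One way to enforce this ordering, sketched here under assumptions, is to let Slurm hold the full job until the smoke test succeeds. The function name `submit_after_smoke` and the script names in the usage note are hypothetical; `--parsable` and `--dependency=afterok:<jobid>` are standard `sbatch` options.

```bash
# Sketch only: function and script names are illustrative assumptions.

# Submit the smoke test, then queue the full job so it starts only if the
# smoke test exits with status 0.
submit_after_smoke() {
    local smoke_script="$1" full_script="$2" smoke_id
    # --parsable prints "jobid" or "jobid;cluster" instead of a sentence.
    smoke_id="$(sbatch --parsable "$smoke_script")" || return 1
    smoke_id="${smoke_id%%;*}"   # keep only the job id
    sbatch --dependency=afterok:"$smoke_id" "$full_script"
}
```

A call might look like `submit_after_smoke smoke_test.sh full_calibration.sh`. If the smoke test fails, the dependent job is never released; depending on cluster configuration it may stay pending until cancelled, so check the queue rather than assuming it was dropped.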