
[2/N] PTQ skill change for transformers 5.0#1229

Merged
mxinO merged 14 commits into main from mxin/skill-evolve-1
Apr 11, 2026

Conversation

@mxinO
Contributor

@mxinO mxinO commented Apr 10, 2026

What does this PR do?

Type of change: Improve

Summary:

  • Update MoE Pattern 2 for transformers 5.0 unified fused experts (_QuantFusedExperts auto-detection)
  • Add PIP_CONSTRAINT workaround and PYTHONPATH guidance for NGC containers
  • Add pip error diagnostic tip (ResolutionImpossible ≠ network failure)
  • Remove duplicated warnings across files — single source of truth per topic
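To make the `PIP_CONSTRAINT` point concrete, here is a minimal hedged sketch of the workaround; the constraint path in the test is hypothetical, not taken from an actual NGC image:

```shell
# Sketch only: NGC containers export PIP_CONSTRAINT to pin package versions,
# which can make "pip install -U transformers" fail with ResolutionImpossible
# even though the network is fine. Clearing the variable lets pip resolve freely.
clear_pip_constraint() {
    if [ -n "${PIP_CONSTRAINT:-}" ]; then
        echo "clearing PIP_CONSTRAINT=$PIP_CONSTRAINT"
        unset PIP_CONSTRAINT
    fi
}

clear_pip_constraint
# After clearing, retry the upgrade:
# pip install -U transformers
```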

Changes by file:


File | Change
-- | --
references/slurm-setup-ptq.md | Container dependency section: PYTHONPATH preferred, PIP_CONSTRAINT workaround, --no-deps fallback
references/unsupported-models.md | MoE Pattern 2 updated for transformers 5.0 auto-detection. Pip install advice points to slurm-setup-ptq.md. Pip error diagnostic added
SKILL.md | Common Pitfalls simplified — warnings point to references instead of duplicating

Usage

Testing

Tested on gemma4 dense and MoE models.

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

  • Documentation
    • Clarified Transformers-version checks (prefer config.json) and warned container upgrades can be blocked by PIP_CONSTRAINT; added pointer to remediation.
    • Shortened Docker/NFS guidance by cross-referencing setup docs instead of explicit commands.
    • Reworked SLURM/container workflow to prefer existing images and add an import → pull fallback.
    • Added in-job dependency remediation steps and clarified MoE auto-detection differences and pip conflict troubleshooting.

@copy-pr-bot

copy-pr-bot bot commented Apr 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Apr 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

PTQ docs revised: SKILL.md Common Pitfalls shortened and cross-referenced; slurm-setup-ptq.md now prefers existing .sqsh, adds enroot import with writable ENROOT paths and a Pyxis inline-pull fallback, and introduces in-job container dependency remediation; unsupported-models.md updates transformers compatibility checks and splits MoE handling by transformers version.

Changes

Cohort / File(s) | Summary
-- | --
SKILL guidance: `.claude/skills/ptq/SKILL.md` | Simplified "Common Pitfalls": removed example-driven ModelOpt/transformers install steps; instruct to check config.json for transformers_version; warn that PIP_CONSTRAINT can block container upgrades; add cross-reference to PTQ SLURM guide.
SLURM container workflow: `.claude/skills/ptq/references/slurm-setup-ptq.md` | Make .sqsh usage deterministic (--container-image) and skip import if present; otherwise create writable ENROOT_CACHE_PATH/ENROOT_DATA_PATH and run enroot import; add Pyxis inline-pull fallback when import fails; introduce "Container dependency pitfalls" with in-job remediation steps (upgrade/install transformers, prefer synced ModelOpt via PYTHONPATH, editable pip install -e ".[hf]" --no-build-isolation, unset PIP_CONSTRAINT, use pip --no-deps).
Unsupported models & MoE handling: `.claude/skills/ptq/references/unsupported-models.md` | Replace ModelOpt-first install guidance with a clearer "check transformers compatibility" flow (pip install -U transformers, then try AutoConfig.from_pretrained()); if still failing, install the transformers main branch; split MoE auto-detection for transformers >= 5.0 vs < 5.0; remove _QuantQwen35MoeExperts example; add pip ResolutionImpossible diagnostics note.
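The "check transformers compatibility" flow above starts from the checkpoint's config.json; a small hedged sketch of that first step (the model_type and version below are made up, not from a real checkpoint):

```shell
# Sketch: a checkpoint's config.json records the transformers version it was
# saved with; reading it tells you whether the container's transformers is too
# old. The config written here is a stand-in for a real model directory.
cat > config.json <<'EOF'
{"model_type": "example-moe", "transformers_version": "5.0.0"}
EOF

REQUIRED=$(python3 -c "import json; print(json.load(open('config.json'))['transformers_version'])")
echo "checkpoint was saved with transformers $REQUIRED"
# If the installed version is older, upgrade before loading the model:
# pip install -U "transformers>=$REQUIRED"
```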

Sequence Diagram(s)

sequenceDiagram
    participant Job as Job (SLURM)
    participant FS as Filesystem (.sqsh)
    participant Enroot as Enroot
    participant Pyxis as Pyxis (NGC)
    participant InJob as In-job (pip / PYTHONPATH)

    Job->>FS: check for existing .sqsh (--container-image)
    alt .sqsh exists
        Job->>Job: launch container from .sqsh
    else .sqsh missing
        Job->>Enroot: create writable ENROOT_CACHE_PATH/ENROOT_DATA_PATH and run enroot import
        alt import succeeds
            Enroot-->>Job: container ready
            Job->>Job: run job
        else import fails (permissions)
            Job->>Pyxis: inline pull using NGC URI (--container-image)
            Pyxis-->>Job: container available (re-pulled per job)
            Job->>Job: run job
        end
    end

    Note over Job,InJob: If container lacks needed deps
    Job->>InJob: attempt in-job fixes (install/upgrade transformers, set PYTHONPATH to prefer local sources, editable install (--no-build-isolation), unset PIP_CONSTRAINT, pip --no-deps)
    InJob-->>Job: success or surface dependency conflict diagnostics
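The .sqsh-first flow in the sequence diagram can be sketched in shell roughly as follows; the image tag, file names, and enroot paths are placeholders, not values from the repo:

```shell
# Sketch of the container acquisition flow: reuse an existing .sqsh, otherwise
# try a one-time enroot import, otherwise fall back to a pyxis inline pull
# (which re-pulls on every job). All paths and tags here are illustrative.
SQSH="$PWD/trtllm-release.sqsh"
IMAGE="nvcr.io/nvidia/tensorrt-llm/release:x.y.z"

if [ -f "$SQSH" ]; then
    CONTAINER_ARG="--container-image=$SQSH"        # reuse the existing image
else
    export ENROOT_CACHE_PATH="$PWD/.enroot-cache"  # writable enroot locations
    export ENROOT_DATA_PATH="$PWD/.enroot-data"
    mkdir -p "$ENROOT_CACHE_PATH" "$ENROOT_DATA_PATH"
    if enroot import -o "$SQSH" "docker://$IMAGE" 2>/dev/null; then
        CONTAINER_ARG="--container-image=$SQSH"    # one-time import succeeded
    else
        CONTAINER_ARG="--container-image=$IMAGE"   # pyxis inline-pull fallback
    fi
fi
echo "$CONTAINER_ARG"
```

On a machine without enroot installed, the import step fails and the sketch falls through to the inline-pull argument, mirroring the diagram's fallback branch.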

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name | Status | Explanation
-- | -- | --
Description Check | ✅ Passed | Check skipped — CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title '[2/N] PTQ skill change for transformers 5.0' accurately summarizes the main changes: updates to PTQ documentation and detection logic to support transformers 5.0, including MoE auto-detection, container setup, and error handling guidance.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Security Anti-Patterns | ✅ Passed | PR contains only documentation changes to markdown files in the .claude/skills/ptq/ directory. No Python source code or configuration files were modified, so the security check is not applicable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

github-actions bot commented Apr 10, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-11 03:53 UTC

mxinO added 2 commits April 10, 2026 07:42
Signed-off-by: Meng Xin <mxin@nvidia.com>
Signed-off-by: Meng Xin <mxin@nvidia.com>
@mxinO mxinO marked this pull request as ready for review April 10, 2026 07:54
@mxinO mxinO requested review from Edwardf0t1, Copilot and kaix-nv April 10, 2026 07:54
Copy link
Copy Markdown
Contributor

Copilot AI left a comment


Pull request overview

Updates PTQ skill documentation to align with HuggingFace transformers 5.0 MoE auto-detection, and consolidates container/dependency troubleshooting guidance into shared reference docs for SLURM/container users.

Changes:

  • Refreshes MoE Pattern 2 documentation to cover transformers 5.0 unified fused experts auto-detection (_QuantFusedExperts).
  • Adds/centralizes container dependency troubleshooting (PYTHONPATH guidance, PIP_CONSTRAINT / pip conflict workarounds).
  • Removes duplicated “pitfalls” guidance by pointing to single-source reference pages.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File | Description
-- | --
.claude/skills/ptq/SKILL.md | Simplifies "Common Pitfalls" and points to reference docs for container upgrade blockers.
.claude/skills/ptq/references/unsupported-models.md | Updates MoE Pattern 2 guidance for transformers 5.0+ and adds a pip error diagnostic tip.
.claude/skills/ptq/references/slurm-setup-ptq.md | Revises container acquisition steps and adds a dependency pitfalls section (PYTHONPATH, PIP_CONSTRAINT, --no-deps).
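The PYTHONPATH remediation listed for slurm-setup-ptq.md works by putting a synced checkout ahead of the container's installed copy; a hedged sketch follows, where the checkout directory name is hypothetical:

```shell
# Sketch: prepend a local Model-Optimizer checkout to PYTHONPATH so Python
# imports it instead of the copy baked into the container. The directory name
# is a placeholder.
export MODELOPT_SRC="$PWD/Model-Optimizer"
mkdir -p "$MODELOPT_SRC"
export PYTHONPATH="$MODELOPT_SRC:${PYTHONPATH:-}"

# PYTHONPATH entries appear in sys.path ahead of site-packages:
python3 -c "import os, sys; assert os.environ['MODELOPT_SRC'] in sys.path; print('ok')"
```

If the checkout is missing compiled extensions, the docs fall back to an editable install instead.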


@codecov

codecov bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.03%. Comparing base (3baa2da) to head (dbf725c).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1229      +/-   ##
==========================================
- Coverage   76.03%   76.03%   -0.01%     
==========================================
  Files         350      350              
  Lines       40469    40537      +68     
==========================================
+ Hits        30772    30822      +50     
- Misses       9697     9715      +18     
Flag | Coverage Δ
-- | --
unit | 55.53% <ø> (+<0.01%) ⬆️


Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
.claude/skills/ptq/references/slurm-setup-ptq.md (1)

34-35: Consider pinning transformers version when installing from git.

Installing from the main branch without a version pin could introduce breaking changes or instability. Consider either:

  • Pinning to a specific commit SHA, or
  • Adding a note that users should verify compatibility after install
📌 Suggested addition

```diff
-pip install git+https://github.com/huggingface/transformers.git --quiet
+pip install git+https://github.com/huggingface/transformers.git@<commit-or-tag> --quiet
+# Or install main but verify: pip show transformers
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 34 - 35, the pip install command "pip install git+https://github.com/huggingface/transformers.git --quiet" should not point to the mutable main branch; update that invocation to pin to a stable commit or tag (e.g., append @<commit-or-tag>) or add a nearby note advising users to verify compatibility after installing from main (e.g., run pip show transformers and test).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.claude/skills/ptq/references/slurm-setup-ptq.md:

  • Around line 34-35: The pip install command "pip install git+https://github.com/huggingface/transformers.git --quiet" should not point to the mutable main branch; update that invocation to pin to a stable commit or tag (e.g., append @<commit-or-tag>) or add a nearby note advising users to verify compatibility after installing from main (e.g., run pip show transformers and test).

---

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c343acc6-ed06-4ef8-8c86-00189a7f4f40

📥 Commits

Reviewing files that changed from the base of the PR and between 3baa2da62e695e210d00f95ca7f06ac2a69bef71 and 429b4f63543f93aab012ff8cf5c58ebdd1780d37.

📒 Files selected for processing (3)
  • .claude/skills/ptq/SKILL.md
  • .claude/skills/ptq/references/slurm-setup-ptq.md
  • .claude/skills/ptq/references/unsupported-models.md

Signed-off-by: Meng Xin <mxin@nvidia.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/ptq/references/slurm-setup-ptq.md:
- Around line 31-35: Replace the unpinned VCS pip install line that currently
uses "git+https://github.com/huggingface/transformers.git" with a pinned ref to
a stable tag or commit (e.g., append "@vX.Y.Z" or "@<commit-hash>") so the job
script installs a reproducible transformers version; update the pip invocation
referenced in the Slurm job setup (the pip install git+... line) to use the
chosen tag/commit and document the chosen ref in the job script comments.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 32e4b411-e386-4929-b338-dfb0838033f1

📥 Commits

Reviewing files that changed from the base of the PR and between 429b4f6 and ea0c193.

📒 Files selected for processing (2)
  • .claude/skills/ptq/SKILL.md
  • .claude/skills/ptq/references/slurm-setup-ptq.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • .claude/skills/ptq/SKILL.md

@shengliangxu
Collaborator

The AI comments are quite good. Fix them and then good to go.

Contributor

@Edwardf0t1 Edwardf0t1 left a comment


LGTM in general, left a few comments.

…add CWD note

Signed-off-by: Meng Xin <mxin@nvidia.com>
Contributor

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
.claude/skills/ptq/references/slurm-setup-ptq.md (1)

37-41: ⚠️ Potential issue | 🟡 Minor

Pin the git fallback example to an immutable ref (current example is still unpinned).

The text says to pin, but the example command still installs from a moving target. Please make the sample itself pinned so it’s reproducible.

Suggested doc fix

```diff
-If the model requires an unreleased fix not yet on PyPI, fall back to installing from git (pin to a tag or commit when possible):
+If the model requires an unreleased fix not yet on PyPI, fall back to installing from git (always pin to a tag or commit):
 
-pip install -U "git+https://github.com/huggingface/transformers.git"
+pip install -U "git+https://github.com/huggingface/transformers.git@<tag-or-commit>"
```

🌐 Web query: What do pip VCS installation docs and Hugging Face Transformers installation docs recommend for reproducible installs (pinned tag/commit vs tracking main/default branch)?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 37 - 41, the example pip VCS install uses a moving target; update the fallback example that currently shows "git+https://github.com/huggingface/transformers.git" to pin to an immutable ref by appending @<tag-or-commit> (e.g., "git+https://github.com/huggingface/transformers.git@<tag-or-commit>") and update the text to recommend preferring a released pip package or a pinned tag/commit (not tracking main) for reproducible installs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In @.claude/skills/ptq/references/slurm-setup-ptq.md:
- Around line 37-41: The example pip VCS install uses a moving target; update
the fallback example that currently shows
"git+https://github.com/huggingface/transformers.git" to pin to an immutable ref
by appending @<tag-or-commit> (e.g.,
"git+https://github.com/huggingface/transformers.git@<tag-or-commit>") and
update the text to recommend preferring a released pip package or a pinned
tag/commit (not tracking main) for reproducible installs, referencing the
existing git URL string in the snippet.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3ac8c68a-f535-4040-93b8-7ea925acee8a

📥 Commits

Reviewing files that changed from the base of the PR and between ea0c193 and 25166ae.

📒 Files selected for processing (2)
  • .claude/skills/ptq/references/slurm-setup-ptq.md
  • .claude/skills/ptq/references/unsupported-models.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • .claude/skills/ptq/references/unsupported-models.md

mxinO added 2 commits April 11, 2026 01:39
… tag/commit for git

Signed-off-by: Meng Xin <mxin@nvidia.com>
…ull flow in unsupported-models

Signed-off-by: Meng Xin <mxin@nvidia.com>
Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (4)
.claude/skills/ptq/references/slurm-setup-ptq.md (4)

45-45: Clarify which command requires running from the repo root.

The phrase "run from the Model-Optimizer repo root" is somewhat ambiguous. Make it explicit that this refers to the pip install -e command in the previous sentence.

📝 Suggested clarification

````diff
-If `PYTHONPATH` doesn't work due to missing compiled extensions, fall back to `pip install -e ".[hf]" --no-build-isolation` (run from the Model-Optimizer repo root).
+If `PYTHONPATH` doesn't work due to missing compiled extensions, fall back to an editable install (from the Model-Optimizer repo root):
+
+```bash
+pip install -e ".[hf]" --no-build-isolation
+```
````
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/ptq/references/slurm-setup-ptq.md at line 45, Clarify that
the instruction to "run from the Model-Optimizer repo root" applies specifically
to the pip install command; update the sentence so it explicitly states that the
pip install -e ".[hf]" --no-build-isolation command must be executed from the
Model-Optimizer repository root to ensure editable install finds local package
metadata and compiled extensions.

8-27: Consider acknowledging the Docker (non-pyxis) alternative.

The container setup section assumes pyxis/enroot availability. Readers on clusters without pyxis/enroot may not realize that skills/common/slurm-setup.md documents a Docker alternative. Adding a brief note pointing to that section would help users quickly identify which pattern applies to their cluster.

📝 Suggested addition

```diff
 ## 1. Container
 
+> **Note**: This section assumes pyxis/enroot is available. For clusters using plain `docker run` instead, see `skills/common/slurm-setup.md` section on the Docker (non-pyxis) variant.
+
 Get the recommended image version from `examples/llm_ptq/README.md`, then look for an existing `.sqsh` file:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 8 - 27, Add a
brief note in the "1. Container" section acknowledging clusters that lack
pyxis/enroot and pointing readers to the Docker alternative documented in
skills/common/slurm-setup.md; specifically update the paragraph around the
enroot import and pyxis inline-pull guidance (references: enroot import,
--container-image, pyxis inline pull) to mention "or use the Docker alternative
described in skills/common/slurm-setup.md" so users know which pattern applies
to their cluster.

31-52: Consider clarifying the relationship between the two transformers install commands.

Lines 34 and 51 both show pip install -U transformers, but in different contexts (general upgrade vs PIP_CONSTRAINT workaround). Readers might not realize these address different scenarios. Consider adding a cross-reference or clarifying note.

📝 Suggested clarification

````diff
 **New models may need newer transformers** than what's in the container:
 
 ```bash
 pip install -U transformers
 ```
 
+> If this fails with ResolutionImpossible, see the PIP_CONSTRAINT workaround below.
````

Or alternatively, at line 47:

```diff
-**Watch for pip dependency conflicts** — NGC containers set `PIP_CONSTRAINT` to pin versions, causing `ResolutionImpossible` errors. Unset it first so pip can resolve freely:
+**Watch for pip dependency conflicts** — NGC containers set `PIP_CONSTRAINT` to pin versions, causing `ResolutionImpossible` errors when upgrading transformers (above). Unset it first so pip can resolve freely:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 31 - 52,
Clarify that the two occurrences of `pip install -U transformers` address
different failure modes: the first (general upgrade) is the normal step to
update transformers, and the second (under the PIP_CONSTRAINT section) is shown
after unsetting `PIP_CONSTRAINT` to resolve dependency
pinning/ResolutionImpossible errors; add a short one-line cross-reference or
note near the first `pip install -U transformers` (or next to the PIP_CONSTRAINT
section) stating “If this fails with ResolutionImpossible due to pinned
constraints, see the PIP_CONSTRAINT workaround below” so readers understand the
relationship and when to use each command.

27-27: Clarify the trade-off of the pyxis inline pull fallback.

The note mentions re-pulling on every job but doesn't explain the implications clearly. Users should understand this is a last resort because it wastes bandwidth and time, versus a one-time enroot import that creates a reusable .sqsh.

📝 Suggested clarification

```diff
-If enroot import fails (e.g., permission errors on lustre), use pyxis inline pull as fallback — pass the NGC URI directly to `--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>"`. Note this re-pulls on every job.
+If enroot import fails (e.g., permission errors on lustre), use pyxis inline pull as fallback — pass the NGC URI directly to `--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>"`. **Note**: this re-pulls the image on every job (wasting bandwidth and startup time), so only use it when `.sqsh` creation is not possible.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/ptq/references/slurm-setup-ptq.md at line 27, Update the note
about the pyxis inline pull fallback to clearly state the trade-off: explain
that passing the NGC URI via
--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>" causes Pyxis
to re-pull the container on every job (increasing network bandwidth and job
startup time), so it should be used only as a last-resort when enroot import
fails (e.g., permission errors on Lustre); contrast this with enroot import
which creates a reusable .sqsh image with a one-time download and is preferred
for performance and bandwidth savings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.claude/skills/ptq/references/slurm-setup-ptq.md:
- Line 45: Clarify that the instruction to "run from the Model-Optimizer repo
root" applies specifically to the pip install command; update the sentence so it
explicitly states that the pip install -e ".[hf]" --no-build-isolation command
must be executed from the Model-Optimizer repository root to ensure editable
install finds local package metadata and compiled extensions.
- Around line 8-27: Add a brief note in the "1. Container" section acknowledging
clusters that lack pyxis/enroot and pointing readers to the Docker alternative
documented in skills/common/slurm-setup.md; specifically update the paragraph
around the enroot import and pyxis inline-pull guidance (references: enroot
import, --container-image, pyxis inline pull) to mention "or use the Docker
alternative described in skills/common/slurm-setup.md" so users know which
pattern applies to their cluster.
- Around line 31-52: Clarify that the two occurrences of `pip install -U
transformers` address different failure modes: the first (general upgrade) is
the normal step to update transformers, and the second (under the PIP_CONSTRAINT
section) is shown after unsetting `PIP_CONSTRAINT` to resolve dependency
pinning/ResolutionImpossible errors; add a short one-line cross-reference or
note near the first `pip install -U transformers` (or next to the PIP_CONSTRAINT
section) stating “If this fails with ResolutionImpossible due to pinned
constraints, see the PIP_CONSTRAINT workaround below” so readers understand the
relationship and when to use each command.
- Line 27: Update the note about the pyxis inline pull fallback to clearly state
the trade-off: explain that passing the NGC URI via
--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>" causes Pyxis
to re-pull the container on every job (increasing network bandwidth and job
startup time), so it should be used only as a last-resort when enroot import
fails (e.g., permission errors on Lustre); contrast this with enroot import
which creates a reusable .sqsh image with a one-time download and is preferred
for performance and bandwidth savings.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fa4984e7-7ccd-4437-88b6-e2c726d6bae7

📥 Commits

Reviewing files that changed from the base of the PR and between d2c6659 and dbf725c.

📒 Files selected for processing (2)
  • .claude/skills/ptq/references/slurm-setup-ptq.md
  • .claude/skills/ptq/references/unsupported-models.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • .claude/skills/ptq/references/unsupported-models.md

@mxinO mxinO merged commit 82cf851 into main Apr 11, 2026
37 checks passed
@mxinO mxinO deleted the mxin/skill-evolve-1 branch April 11, 2026 03:52
kinjalpatel27 pushed a commit that referenced this pull request Apr 13, 2026
### What does this PR do?

Type of change: Improve <!-- Use one of the following: Bug fix, new
feature, new example, new tests, documentation. -->

<p style="white-space: pre-wrap; margin-top: 0.1em; margin-bottom:
0.2em; color: rgb(97, 97, 97); font-family: -apple-system,
&quot;system-ui&quot;, &quot;Segoe UI&quot;, Roboto, sans-serif;
font-size: 13px; font-style: normal; font-variant-ligatures: normal;
font-variant-caps: normal; font-weight: 400; letter-spacing: normal;
orphans: 2; text-align: start; text-indent: 0px; text-transform: none;
widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;
background-color: rgb(242, 242, 242); text-decoration-thickness:
initial; text-decoration-style: initial; text-decoration-color:
initial;"><strong>Summary:</strong></p><ul style="padding-inline-start:
2em; color: rgb(97, 97, 97); font-family: -apple-system,
&quot;system-ui&quot;, &quot;Segoe UI&quot;, Roboto, sans-serif;
font-size: 13px; font-style: normal; font-variant-ligatures: normal;
font-variant-caps: normal; font-weight: 400; letter-spacing: normal;
orphans: 2; text-align: start; text-indent: 0px; text-transform: none;
widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;
white-space: normal; background-color: rgb(242, 242, 242);
text-decoration-thickness: initial; text-decoration-style: initial;
text-decoration-color: initial;"><li>Update MoE Pattern 2 for
transformers 5.0 unified fused experts (<code style="font-family:
monospace; color: rgb(163, 21, 21); background-color: rgba(0, 0, 0,
0.1); padding: 2px 4px; border-radius: 3px; word-break: break-word;
font-size:
0.9em;">_QuantFusedExperts</code><span> </span>auto-detection)</li><li>Add<span> </span><code
style="font-family: monospace; color: rgb(163, 21, 21);
background-color: rgba(0, 0, 0, 0.1); padding: 2px 4px; border-radius:
3px; word-break: break-word; font-size:
0.9em;">PIP_CONSTRAINT</code><span> </span>workaround
and<span> </span><code style="font-family: monospace; color: rgb(163,
21, 21); background-color: rgba(0, 0, 0, 0.1); padding: 2px 4px;
border-radius: 3px; word-break: break-word; font-size:
0.9em;">PYTHONPATH</code><span> </span>guidance for NGC
containers</li><li>Add pip error diagnostic tip (<code
style="font-family: monospace; color: rgb(163, 21, 21);
background-color: rgba(0, 0, 0, 0.1); padding: 2px 4px; border-radius:
3px; word-break: break-word; font-size:
0.9em;">ResolutionImpossible</code><span> </span>≠ network
failure)</li><li>Remove duplicated warnings across files — single source
of truth per topic</li></ul><p style="white-space: pre-wrap; margin-top:
0.1em; margin-bottom: 0.2em; color: rgb(97, 97, 97); font-family:
-apple-system, &quot;system-ui&quot;, &quot;Segoe UI&quot;, Roboto,
sans-serif; font-size: 13px; font-style: normal; font-variant-ligatures:
normal; font-variant-caps: normal; font-weight: 400; letter-spacing:
normal; orphans: 2; text-align: start; text-indent: 0px; text-transform:
none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;
background-color: rgb(242, 242, 242); text-decoration-thickness:
initial; text-decoration-style: initial; text-decoration-color:
initial;"><strong>Changes by file:</strong></p>

<br class="Apple-interchange-newline">

File | Change
-- | --
references/slurm-setup-ptq.md | Container dependency section: PYTHONPATH
preferred, PIP_CONSTRAINT workaround, --no-deps fallback
references/unsupported-models.md | MoE Pattern 2 updated for
transformers 5.0 auto-detection. Pip install advice points to
slurm-setup-ptq.md. Pip error diagnostic added
SKILL.md | Common Pitfalls simplified — warnings point to references
instead of duplicating

### Usage


### Testing
Tested on gemma4 dense and MoE models.

### Before your PR is "*Ready for review*"

Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).

Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: ✅ / ❌ / N/A <!--- If ❌, explain
why. -->
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ / ❌ / N/A
<!--- Mandatory -->
- Did you write any new necessary tests?: ✅ / ❌ / N/A <!--- Mandatory
for new features or examples. -->
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
✅ / ❌ / N/A <!--- Only for new features, API changes, critical bug fixes
or backward incompatible changes. -->

### Additional Information
<!-- E.g. related issue. -->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Clarified Transformers-version checks (prefer config.json) and warned
container upgrades can be blocked by PIP_CONSTRAINT; added pointer to
remediation.
* Shortened Docker/NFS guidance by cross-referencing setup docs instead
of explicit commands.
* Reworked SLURM/container workflow to prefer existing images and add an
import → pull fallback.
* Added in-job dependency remediation steps and clarified MoE
auto-detection differences and pip conflict troubleshooting.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Meng Xin <mxin@nvidia.com>
