[2/N] PTQ skill change for transformers 5.0#1229
Conversation
Signed-off-by: Meng Xin <mxin@nvidia.com>
…container pitfalls Signed-off-by: Meng Xin <mxin@nvidia.com>
… per topic Signed-off-by: Meng Xin <mxin@nvidia.com>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
> **Note**: Reviews paused. This branch is under active development; to avoid overwhelming you with review comments on an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable, and the review can be resumed with CodeRabbit's commands.
📝 Walkthrough

PTQ docs revised: SKILL.md Common Pitfalls shortened and cross-referenced; slurm-setup-ptq.md now prefers an existing .sqsh, adds enroot import with writable ENROOT paths and a Pyxis inline-pull fallback, and introduces in-job container dependency remediation; unsupported-models.md updates transformers compatibility checks and splits MoE handling by transformers version.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Job as Job (SLURM)
    participant FS as Filesystem (.sqsh)
    participant Enroot as Enroot
    participant Pyxis as Pyxis (NGC)
    participant InJob as In-job (pip / PYTHONPATH)
    Job->>FS: check for existing .sqsh (--container-image)
    alt .sqsh exists
        Job->>Job: launch container from .sqsh
    else .sqsh missing
        Job->>Enroot: create writable ENROOT_CACHE_PATH/ENROOT_DATA_PATH and run enroot import
        alt import succeeds
            Enroot-->>Job: container ready
            Job->>Job: run job
        else import fails (permissions)
            Job->>Pyxis: inline pull using NGC URI (--container-image)
            Pyxis-->>Job: container available (re-pulled per job)
            Job->>Job: run job
        end
    end
    Note over Job,InJob: If container lacks needed deps
    Job->>InJob: attempt in-job fixes (install/upgrade transformers, set PYTHONPATH to prefer local sources, editable install with --no-build-isolation, unset PIP_CONSTRAINT, pip --no-deps)
    InJob-->>Job: success or surface dependency conflict diagnostics
```
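The container-acquisition fallback chain in the diagram can be sketched as a small POSIX-shell helper. The function name and example paths are illustrative, not from the PR; only the ordering (existing `.sqsh`, then `enroot import`, then pyxis inline pull) comes from the docs being reviewed:

```shell
# Hypothetical helper mirroring the fallback chain:
# existing .sqsh -> enroot import -> pyxis inline pull (re-pulled per job).
pick_container_image() {
  sqsh_path="$1"   # e.g. /lustre/containers/trtllm.sqsh (example path)
  ngc_uri="$2"     # e.g. nvcr.io/nvidia/tensorrt-llm/release:<version>
  if [ -f "$sqsh_path" ]; then
    echo "$sqsh_path"                      # reuse the cached image
  elif command -v enroot >/dev/null 2>&1 \
       && enroot import -o "$sqsh_path" "docker://$ngc_uri" 2>/dev/null; then
    echo "$sqsh_path"                      # one-time import, reusable later
  else
    echo "$ngc_uri"                        # pyxis pulls this on every job
  fi
}
```

Whatever the function prints would be passed to `--container-image` in the SLURM job.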
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 4 passed
Pull request overview
Updates PTQ skill documentation to align with HuggingFace transformers 5.0 MoE auto-detection, and consolidates container/dependency troubleshooting guidance into shared reference docs for SLURM/container users.
Changes:
- Refreshes MoE Pattern 2 documentation to cover transformers 5.0 unified fused experts auto-detection (`_QuantFusedExperts`).
- Adds/centralizes container dependency troubleshooting (`PYTHONPATH` guidance, `PIP_CONSTRAINT` / pip conflict workarounds).
- Removes duplicated “pitfalls” guidance by pointing to single-source reference pages.
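The transformers 5.0 boundary behind that first bullet can be illustrated with a tiny helper. The function name is hypothetical; the only fact taken from the PR is that 5.x auto-detects fused experts while 4.x needs the manual Pattern 2 steps:

```shell
# Hypothetical: choose MoE Pattern 2 handling from a transformers version string.
moe_pattern2_mode() {
  major="${1%%.*}"          # major component, e.g. "5" from "5.0.0"
  if [ "$major" -ge 5 ]; then
    echo "auto"             # 5.x: unified fused experts (_QuantFusedExperts) auto-detected
  else
    echo "manual"           # 4.x: apply the manual Pattern 2 handling
  fi
}
```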
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| .claude/skills/ptq/SKILL.md | Simplifies “Common Pitfalls” and points to reference docs for container upgrade blockers. |
| .claude/skills/ptq/references/unsupported-models.md | Updates MoE Pattern 2 guidance for transformers 5.0+ and adds a pip error diagnostic tip. |
| .claude/skills/ptq/references/slurm-setup-ptq.md | Revises container acquisition steps and adds a dependency pitfalls section (PYTHONPATH, PIP_CONSTRAINT, --no-deps). |
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #1229      +/-   ##
==========================================
- Coverage   76.03%   76.03%   -0.01%
==========================================
  Files         350      350
  Lines       40469    40537      +68
==========================================
+ Hits        30772    30822      +50
- Misses       9697     9715      +18
==========================================
```

Flags with carried forward coverage won't be shown.
|
🧹 Nitpick comments (1)
.claude/skills/ptq/references/slurm-setup-ptq.md (1)
**34-35**: Consider pinning the transformers version when installing from git. Installing from the main branch without a version pin could introduce breaking changes or instability. Consider either:
- Pinning to a specific commit SHA, or
- Adding a note that users should verify compatibility after install
📌 Suggested addition
```diff
-pip install git+https://github.com/huggingface/transformers.git --quiet
+pip install git+https://github.com/huggingface/transformers.git@<commit-or-tag> --quiet
+# Or install main but verify: pip show transformers
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.
In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 34 - 35, The
pip install command "pip install
git+https://github.com/huggingface/transformers.git --quiet" should not point to
the mutable main branch; update that invocation to pin to a stable commit or tag
(e.g., append `@<commit-or-tag>`) or add a nearby note advising users to verify
compatibility after installing from main (e.g., run pip show transformers and
test), so change the installer string and/or add the compatibility note in
slurm-setup-ptq.md.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: `c343acc6-ed06-4ef8-8c86-00189a7f4f40`

📥 Commits

Reviewing files that changed from the base of the PR and between 3baa2da62e695e210d00f95ca7f06ac2a69bef71 and 429b4f63543f93aab012ff8cf5c58ebdd1780d37.

📒 Files selected for processing (3)

- `.claude/skills/ptq/SKILL.md`
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
- `.claude/skills/ptq/references/unsupported-models.md`
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/ptq/references/slurm-setup-ptq.md:
- Around line 31-35: Replace the unpinned VCS pip install line that currently
uses "git+https://github.com/huggingface/transformers.git" with a pinned ref to
a stable tag or commit (e.g., append "@vX.Y.Z" or "@<commit-hash>") so the job
script installs a reproducible transformers version; update the pip invocation
referenced in the Slurm job setup (the pip install git+... line) to use the
chosen tag/commit and document the chosen ref in the job script comments.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 32e4b411-e386-4929-b338-dfb0838033f1
📒 Files selected for processing (2)
- `.claude/skills/ptq/SKILL.md`
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
🚧 Files skipped from review as they are similar to previous changes (1)
- .claude/skills/ptq/SKILL.md
The AI comments are quite good. Fix them and then good to go.
Edwardf0t1 left a comment:

LGTM in general, left a few comments.
…add CWD note Signed-off-by: Meng Xin <mxin@nvidia.com>
♻️ Duplicate comments (1)
.claude/skills/ptq/references/slurm-setup-ptq.md (1)
**37-41**: ⚠️ Potential issue | 🟡 Minor — Pin the git fallback example to an immutable ref (the current example is still unpinned).
The text says to pin, but the example command still installs from a moving target. Please make the sample itself pinned so it’s reproducible.
Suggested doc fix
````diff
-If the model requires an unreleased fix not yet on PyPI, fall back to installing from git (pin to a tag or commit when possible):
+If the model requires an unreleased fix not yet on PyPI, fall back to installing from git (always pin to a tag or commit):
 ```bash
-pip install -U "git+https://github.com/huggingface/transformers.git"
+pip install -U "git+https://github.com/huggingface/transformers.git@<tag-or-commit>"
 ```
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 37 - 41, The example pip VCS install uses a moving target; update the fallback example that currently shows "git+https://github.com/huggingface/transformers.git" to pin to an immutable ref by appending @<tag-or-commit> (e.g., "git+https://github.com/huggingface/transformers.git@<tag-or-commit>") and update the text to recommend preferring a released pip package or a pinned tag/commit (not tracking main) for reproducible installs, referencing the existing git URL string in the snippet.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3ac8c68a-f535-4040-93b8-7ea925acee8a
📒 Files selected for processing (2)
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
- `.claude/skills/ptq/references/unsupported-models.md`
🚧 Files skipped from review as they are similar to previous changes (1)
- .claude/skills/ptq/references/unsupported-models.md
… tag/commit for git Signed-off-by: Meng Xin <mxin@nvidia.com>
…ull flow in unsupported-models Signed-off-by: Meng Xin <mxin@nvidia.com>
🧹 Nitpick comments (4)
.claude/skills/ptq/references/slurm-setup-ptq.md (4)
**45**: Clarify which command requires running from the repo root. The phrase "run from the Model-Optimizer repo root" is somewhat ambiguous. Make it explicit that this refers to the `pip install -e` command in the previous sentence.

📝 Suggested clarification
````diff
-If `PYTHONPATH` doesn't work due to missing compiled extensions, fall back to `pip install -e ".[hf]" --no-build-isolation` (run from the Model-Optimizer repo root).
+If `PYTHONPATH` doesn't work due to missing compiled extensions, fall back to an editable install (from the Model-Optimizer repo root):
+
+```bash
+pip install -e ".[hf]" --no-build-isolation
+```
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md at line 45, Clarify that the instruction to "run from the Model-Optimizer repo root" applies specifically to the pip install command; update the sentence so it explicitly states that the pip install -e ".[hf]" --no-build-isolation command must be executed from the Model-Optimizer repository root to ensure editable install finds local package metadata and compiled extensions.
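As a sketch of the two options this comment distinguishes — `PYTHONPATH` first, editable install as fallback — the following shell fragment may help; the repo checkout path is a placeholder, not from the docs:

```shell
# Option 1: prefer local Model-Optimizer sources via PYTHONPATH (no install).
REPO_ROOT=/workspace/Model-Optimizer   # placeholder checkout path
export PYTHONPATH="$REPO_ROOT:${PYTHONPATH:-}"

# Option 2: if compiled extensions are missing, do an editable install
# from the repo root instead:
#   cd "$REPO_ROOT" && pip install -e ".[hf]" --no-build-isolation
```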
**8-27**: Consider acknowledging the Docker (non-pyxis) alternative. The container setup section assumes pyxis/enroot availability. Readers on clusters without pyxis/enroot may not realize that `skills/common/slurm-setup.md` documents a Docker alternative. Adding a brief note pointing to that section would help users quickly identify which pattern applies to their cluster.

📝 Suggested addition
```diff
 ## 1. Container

+> **Note**: This section assumes pyxis/enroot is available. For clusters using plain `docker run` instead, see the `skills/common/slurm-setup.md` section on the Docker (non-pyxis) variant.
+
 Get the recommended image version from `examples/llm_ptq/README.md`, then look for an existing `.sqsh` file:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 8 - 27, Add a brief note in the "1. Container" section acknowledging clusters that lack pyxis/enroot and pointing readers to the Docker alternative documented in skills/common/slurm-setup.md; specifically update the paragraph around the enroot import and pyxis inline-pull guidance (references: enroot import, --container-image, pyxis inline pull) to mention "or use the Docker alternative described in skills/common/slurm-setup.md" so users know which pattern applies to their cluster.
**31-52**: Consider clarifying the relationship between the two transformers install commands. Lines 34 and 51 both show `pip install -U transformers`, but in different contexts (general upgrade vs PIP_CONSTRAINT workaround). Readers might not realize these address different scenarios. Consider adding a cross-reference or clarifying note.

📝 Suggested clarification
````diff
 **New models may need newer transformers** than what's in the container:

 ```bash
 pip install -U transformers
 ```
+
+> If this fails with `ResolutionImpossible`, see the `PIP_CONSTRAINT` workaround below.
````

Or alternatively, at line 47:

```diff
-**Watch for pip dependency conflicts** — NGC containers set `PIP_CONSTRAINT` to pin versions, causing `ResolutionImpossible` errors. Unset it first so pip can resolve freely:
+**Watch for pip dependency conflicts** — NGC containers set `PIP_CONSTRAINT` to pin versions, causing `ResolutionImpossible` errors when upgrading transformers (above). Unset it first so pip can resolve freely:
```
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 31 - 52, Clarify that the two occurrences of `pip install -U transformers` address different failure modes: the first (general upgrade) is the normal step to update transformers, and the second (under the PIP_CONSTRAINT section) is shown after unsetting `PIP_CONSTRAINT` to resolve dependency pinning/ResolutionImpossible errors; add a short one-line cross-reference or note near the first `pip install -U transformers` (or next to the PIP_CONSTRAINT section) stating “If this fails with ResolutionImpossible due to pinned constraints, see the PIP_CONSTRAINT workaround below” so readers understand the relationship and when to use each command.
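A minimal sketch of the workaround sequence discussed here, written as a helper to call in the job shell before upgrading (the function name is illustrative; `unset PIP_CONSTRAINT` and the pip commands are the facts taken from the docs):

```shell
# Clear NGC's pip version pins so the resolver can upgrade transformers.
clear_pip_constraint() {
  if [ -n "${PIP_CONSTRAINT:-}" ]; then
    echo "clearing PIP_CONSTRAINT ($PIP_CONSTRAINT)"
    unset PIP_CONSTRAINT
  fi
}
# After calling it:  pip install -U transformers
# Last resort only:  pip install -U --no-deps transformers
```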
**27**: Clarify the trade-off of the pyxis inline pull fallback. The note mentions re-pulling on every job but doesn't explain the implications clearly. Users should understand this is a last resort because it wastes bandwidth and time, versus a one-time `enroot import` that creates a reusable `.sqsh`.

📝 Suggested clarification
```diff
-If enroot import fails (e.g., permission errors on lustre), use pyxis inline pull as fallback — pass the NGC URI directly to `--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>"`. Note this re-pulls on every job.
+If enroot import fails (e.g., permission errors on lustre), use pyxis inline pull as fallback — pass the NGC URI directly to `--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>"`. **Note**: this re-pulls the image on every job (wasting bandwidth and startup time), so only use it when `.sqsh` creation is not possible.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md at line 27, Update the note about the pyxis inline pull fallback to clearly state the trade-off: explain that passing the NGC URI via --container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>" causes Pyxis to re-pull the container on every job (increasing network bandwidth and job startup time), so it should be used only as a last-resort when enroot import fails (e.g., permission errors on Lustre); contrast this with enroot import which creates a reusable .sqsh image with a one-time download and is preferred for performance and bandwidth savings.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fa4984e7-7ccd-4437-88b6-e2c726d6bae7
📒 Files selected for processing (2)
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
- `.claude/skills/ptq/references/unsupported-models.md`
🚧 Files skipped from review as they are similar to previous changes (1)
- .claude/skills/ptq/references/unsupported-models.md
What does this PR do?
Type of change: Improve
Summary:
- Update MoE Pattern 2 for transformers 5.0 unified fused experts (`_QuantFusedExperts` auto-detection)
- Add `PIP_CONSTRAINT` workaround and `PYTHONPATH` guidance for NGC containers
- Add pip error diagnostic tip (`ResolutionImpossible` ≠ network failure)
- Remove duplicated warnings across files — single source of truth per topic

**Changes by file:**

| File | Change |
|---|---|
| references/slurm-setup-ptq.md | Container dependency section: PYTHONPATH preferred, PIP_CONSTRAINT workaround, --no-deps fallback |
| references/unsupported-models.md | MoE Pattern 2 updated for transformers 5.0 auto-detection. Pip install advice points to slurm-setup-ptq.md. Pip error diagnostic added |
| SKILL.md | Common Pitfalls simplified — warnings point to references instead of duplicating |
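The "`ResolutionImpossible` ≠ network failure" diagnostic tip could be sketched as a classifier over pip's error output; this helper is hypothetical (not part of the skill docs), and the message substrings are illustrative:

```shell
# Classify a pip failure message so the right remediation is suggested.
classify_pip_error() {
  case "$1" in
    *ResolutionImpossible*)
      echo "dependency conflict: try unsetting PIP_CONSTRAINT" ;;
    *"Connection timed out"*|*"Temporary failure in name resolution"*)
      echo "network failure: retry or check proxy" ;;
    *)
      echo "other" ;;
  esac
}
```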
Usage
Testing
Tested on gemma4 dense and MoE models.
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: ✅ / ❌ / N/A
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ / ❌ / N/A
- Did you write any new necessary tests?: ✅ / ❌ / N/A
- Did you update Changelog?: ✅ / ❌ / N/A

Additional Information
Summary by CodeRabbit

- **Documentation**
  - Clarified Transformers-version checks (prefer config.json) and warned container upgrades can be blocked by PIP_CONSTRAINT; added pointer to remediation.
  - Shortened Docker/NFS guidance by cross-referencing setup docs instead of explicit commands.
  - Reworked SLURM/container workflow to prefer existing images and add an import → pull fallback.
  - Added in-job dependency remediation steps and clarified MoE auto-detection differences and pip conflict troubleshooting.