[2/N] PTQ skill change for transformers 5.0#1229
Conversation
Signed-off-by: Meng Xin <mxin@nvidia.com>
…container pitfalls Signed-off-by: Meng Xin <mxin@nvidia.com>
… per topic Signed-off-by: Meng Xin <mxin@nvidia.com>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
> **Note**: Reviews paused. This branch is under active development; to avoid overwhelming you with review comments on an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable, and the review can be resumed with CodeRabbit's commands.
📝 Walkthrough

PTQ docs revised: SKILL.md Common Pitfalls shortened and cross-referenced; slurm-setup-ptq.md now prefers an existing .sqsh, adds enroot import with writable ENROOT paths and a Pyxis inline-pull fallback, and introduces in-job container dependency remediation; unsupported-models.md updates transformers compatibility checks and splits MoE handling by transformers version.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Job as Job (SLURM)
    participant FS as Filesystem (.sqsh)
    participant Enroot as Enroot
    participant Pyxis as Pyxis (NGC)
    participant InJob as In-job (pip / PYTHONPATH)
    Job->>FS: check for existing .sqsh (--container-image)
    alt .sqsh exists
        Job->>Job: launch container from .sqsh
    else .sqsh missing
        Job->>Enroot: create writable ENROOT_CACHE_PATH/ENROOT_DATA_PATH and run enroot import
        alt import succeeds
            Enroot-->>Job: container ready
            Job->>Job: run job
        else import fails (permissions)
            Job->>Pyxis: inline pull using NGC URI (--container-image)
            Pyxis-->>Job: container available (re-pulled per job)
            Job->>Job: run job
        end
    end
    Note over Job,InJob: If container lacks needed deps
    Job->>InJob: attempt in-job fixes (install/upgrade transformers, set PYTHONPATH to prefer local sources, editable install with --no-build-isolation, unset PIP_CONSTRAINT, pip --no-deps)
    InJob-->>Job: success or surface dependency conflict diagnostics
```
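The container-acquisition fallback chain in the diagram can be sketched as a small POSIX-shell helper. The function name and example paths are illustrative, not from the PR; only the ordering (existing `.sqsh`, then `enroot import`, then pyxis inline pull) comes from the docs being reviewed:

```shell
# Hypothetical helper mirroring the fallback chain:
# existing .sqsh -> enroot import -> pyxis inline pull (re-pulled per job).
pick_container_image() {
  sqsh_path="$1"   # e.g. /lustre/containers/trtllm.sqsh (example path)
  ngc_uri="$2"     # e.g. nvcr.io/nvidia/tensorrt-llm/release:<version>
  if [ -f "$sqsh_path" ]; then
    echo "$sqsh_path"                      # reuse the cached image
  elif command -v enroot >/dev/null 2>&1 \
       && enroot import -o "$sqsh_path" "docker://$ngc_uri" 2>/dev/null; then
    echo "$sqsh_path"                      # one-time import, reusable later
  else
    echo "$ngc_uri"                        # pyxis pulls this on every job
  fi
}
```

Whatever the function prints would be passed to `--container-image` in the SLURM job.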
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 4 passed
Pull request overview
Updates PTQ skill documentation to align with HuggingFace transformers 5.0 MoE auto-detection, and consolidates container/dependency troubleshooting guidance into shared reference docs for SLURM/container users.
Changes:
- Refreshes MoE Pattern 2 documentation to cover transformers 5.0 unified fused experts auto-detection (`_QuantFusedExperts`).
- Adds/centralizes container dependency troubleshooting (`PYTHONPATH` guidance, `PIP_CONSTRAINT` / pip conflict workarounds).
- Removes duplicated “pitfalls” guidance by pointing to single-source reference pages.
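The transformers 5.0 boundary behind that first bullet can be illustrated with a tiny helper. The function name is hypothetical; the only fact taken from the PR is that 5.x auto-detects fused experts while 4.x needs the manual Pattern 2 steps:

```shell
# Hypothetical: choose MoE Pattern 2 handling from a transformers version string.
moe_pattern2_mode() {
  major="${1%%.*}"          # major component, e.g. "5" from "5.0.0"
  if [ "$major" -ge 5 ]; then
    echo "auto"             # 5.x: unified fused experts (_QuantFusedExperts) auto-detected
  else
    echo "manual"           # 4.x: apply the manual Pattern 2 handling
  fi
}
```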
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| .claude/skills/ptq/SKILL.md | Simplifies “Common Pitfalls” and points to reference docs for container upgrade blockers. |
| .claude/skills/ptq/references/unsupported-models.md | Updates MoE Pattern 2 guidance for transformers 5.0+ and adds a pip error diagnostic tip. |
| .claude/skills/ptq/references/slurm-setup-ptq.md | Revises container acquisition steps and adds a dependency pitfalls section (PYTHONPATH, PIP_CONSTRAINT, --no-deps). |
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #1229      +/-   ##
==========================================
- Coverage   76.03%   76.03%   -0.01%
==========================================
  Files         350      350
  Lines       40469    40537      +68
==========================================
+ Hits        30772    30822      +50
- Misses       9697     9715      +18
==========================================
```

Flags with carried forward coverage won't be shown.
|
🧹 Nitpick comments (1)
.claude/skills/ptq/references/slurm-setup-ptq.md (1)
**34-35**: Consider pinning the transformers version when installing from git. Installing from the main branch without a version pin could introduce breaking changes or instability. Consider either:
- Pinning to a specific commit SHA, or
- Adding a note that users should verify compatibility after install
📌 Suggested addition
```diff
-pip install git+https://github.com/huggingface/transformers.git --quiet
+pip install git+https://github.com/huggingface/transformers.git@<commit-or-tag> --quiet
+# Or install main but verify: pip show transformers
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.
In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 34 - 35, The
pip install command "pip install
git+https://github.com/huggingface/transformers.git --quiet" should not point to
the mutable main branch; update that invocation to pin to a stable commit or tag
(e.g., append `@<commit-or-tag>`) or add a nearby note advising users to verify
compatibility after installing from main (e.g., run pip show transformers and
test), so change the installer string and/or add the compatibility note in
slurm-setup-ptq.md.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: `c343acc6-ed06-4ef8-8c86-00189a7f4f40`

📥 Commits

Reviewing files that changed from the base of the PR and between 3baa2da62e695e210d00f95ca7f06ac2a69bef71 and 429b4f63543f93aab012ff8cf5c58ebdd1780d37.

📒 Files selected for processing (3)

- `.claude/skills/ptq/SKILL.md`
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
- `.claude/skills/ptq/references/unsupported-models.md`
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/ptq/references/slurm-setup-ptq.md:
- Around line 31-35: Replace the unpinned VCS pip install line that currently
uses "git+https://github.com/huggingface/transformers.git" with a pinned ref to
a stable tag or commit (e.g., append "@vX.Y.Z" or "@<commit-hash>") so the job
script installs a reproducible transformers version; update the pip invocation
referenced in the Slurm job setup (the pip install git+... line) to use the
chosen tag/commit and document the chosen ref in the job script comments.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 32e4b411-e386-4929-b338-dfb0838033f1
📒 Files selected for processing (2)
- `.claude/skills/ptq/SKILL.md`
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
🚧 Files skipped from review as they are similar to previous changes (1)
- .claude/skills/ptq/SKILL.md
The AI comments are quite good. Fix them and then good to go.
Edwardf0t1 left a comment:

LGTM in general, left a few comments.
…add CWD note Signed-off-by: Meng Xin <mxin@nvidia.com>
♻️ Duplicate comments (1)
.claude/skills/ptq/references/slurm-setup-ptq.md (1)
**37-41**: ⚠️ Potential issue | 🟡 Minor — Pin the git fallback example to an immutable ref (the current example is still unpinned).
The text says to pin, but the example command still installs from a moving target. Please make the sample itself pinned so it’s reproducible.
Suggested doc fix
````diff
-If the model requires an unreleased fix not yet on PyPI, fall back to installing from git (pin to a tag or commit when possible):
+If the model requires an unreleased fix not yet on PyPI, fall back to installing from git (always pin to a tag or commit):
 ```bash
-pip install -U "git+https://github.com/huggingface/transformers.git"
+pip install -U "git+https://github.com/huggingface/transformers.git@<tag-or-commit>"
 ```
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 37 - 41, The example pip VCS install uses a moving target; update the fallback example that currently shows "git+https://github.com/huggingface/transformers.git" to pin to an immutable ref by appending @<tag-or-commit> (e.g., "git+https://github.com/huggingface/transformers.git@<tag-or-commit>") and update the text to recommend preferring a released pip package or a pinned tag/commit (not tracking main) for reproducible installs, referencing the existing git URL string in the snippet.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3ac8c68a-f535-4040-93b8-7ea925acee8a
📒 Files selected for processing (2)
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
- `.claude/skills/ptq/references/unsupported-models.md`
🚧 Files skipped from review as they are similar to previous changes (1)
- .claude/skills/ptq/references/unsupported-models.md
… tag/commit for git Signed-off-by: Meng Xin <mxin@nvidia.com>
…ull flow in unsupported-models Signed-off-by: Meng Xin <mxin@nvidia.com>
🧹 Nitpick comments (4)
.claude/skills/ptq/references/slurm-setup-ptq.md (4)
**45**: Clarify which command requires running from the repo root. The phrase "run from the Model-Optimizer repo root" is somewhat ambiguous. Make it explicit that this refers to the `pip install -e` command in the previous sentence.

📝 Suggested clarification
````diff
-If `PYTHONPATH` doesn't work due to missing compiled extensions, fall back to `pip install -e ".[hf]" --no-build-isolation` (run from the Model-Optimizer repo root).
+If `PYTHONPATH` doesn't work due to missing compiled extensions, fall back to an editable install (from the Model-Optimizer repo root):
+
+```bash
+pip install -e ".[hf]" --no-build-isolation
+```
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md at line 45, Clarify that the instruction to "run from the Model-Optimizer repo root" applies specifically to the pip install command; update the sentence so it explicitly states that the pip install -e ".[hf]" --no-build-isolation command must be executed from the Model-Optimizer repository root to ensure editable install finds local package metadata and compiled extensions.
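As a sketch of the two options this comment distinguishes — `PYTHONPATH` first, editable install as fallback — the following shell fragment may help; the repo checkout path is a placeholder, not from the docs:

```shell
# Option 1: prefer local Model-Optimizer sources via PYTHONPATH (no install).
REPO_ROOT=/workspace/Model-Optimizer   # placeholder checkout path
export PYTHONPATH="$REPO_ROOT:${PYTHONPATH:-}"

# Option 2: if compiled extensions are missing, do an editable install
# from the repo root instead:
#   cd "$REPO_ROOT" && pip install -e ".[hf]" --no-build-isolation
```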
**8-27**: Consider acknowledging the Docker (non-pyxis) alternative. The container setup section assumes pyxis/enroot availability. Readers on clusters without pyxis/enroot may not realize that `skills/common/slurm-setup.md` documents a Docker alternative. Adding a brief note pointing to that section would help users quickly identify which pattern applies to their cluster.

📝 Suggested addition
```diff
 ## 1. Container

+> **Note**: This section assumes pyxis/enroot is available. For clusters using plain `docker run` instead, see the `skills/common/slurm-setup.md` section on the Docker (non-pyxis) variant.
+
 Get the recommended image version from `examples/llm_ptq/README.md`, then look for an existing `.sqsh` file:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 8 - 27, Add a brief note in the "1. Container" section acknowledging clusters that lack pyxis/enroot and pointing readers to the Docker alternative documented in skills/common/slurm-setup.md; specifically update the paragraph around the enroot import and pyxis inline-pull guidance (references: enroot import, --container-image, pyxis inline pull) to mention "or use the Docker alternative described in skills/common/slurm-setup.md" so users know which pattern applies to their cluster.
**31-52**: Consider clarifying the relationship between the two transformers install commands. Lines 34 and 51 both show `pip install -U transformers`, but in different contexts (general upgrade vs PIP_CONSTRAINT workaround). Readers might not realize these address different scenarios. Consider adding a cross-reference or clarifying note.

📝 Suggested clarification
````diff
 **New models may need newer transformers** than what's in the container:

 ```bash
 pip install -U transformers
 ```
+
+> If this fails with `ResolutionImpossible`, see the `PIP_CONSTRAINT` workaround below.
````

Or alternatively, at line 47:

```diff
-**Watch for pip dependency conflicts** — NGC containers set `PIP_CONSTRAINT` to pin versions, causing `ResolutionImpossible` errors. Unset it first so pip can resolve freely:
+**Watch for pip dependency conflicts** — NGC containers set `PIP_CONSTRAINT` to pin versions, causing `ResolutionImpossible` errors when upgrading transformers (above). Unset it first so pip can resolve freely:
```
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md around lines 31 - 52, Clarify that the two occurrences of `pip install -U transformers` address different failure modes: the first (general upgrade) is the normal step to update transformers, and the second (under the PIP_CONSTRAINT section) is shown after unsetting `PIP_CONSTRAINT` to resolve dependency pinning/ResolutionImpossible errors; add a short one-line cross-reference or note near the first `pip install -U transformers` (or next to the PIP_CONSTRAINT section) stating “If this fails with ResolutionImpossible due to pinned constraints, see the PIP_CONSTRAINT workaround below” so readers understand the relationship and when to use each command.
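A minimal sketch of the workaround sequence discussed here, written as a helper to call in the job shell before upgrading (the function name is illustrative; `unset PIP_CONSTRAINT` and the pip commands are the facts taken from the docs):

```shell
# Clear NGC's pip version pins so the resolver can upgrade transformers.
clear_pip_constraint() {
  if [ -n "${PIP_CONSTRAINT:-}" ]; then
    echo "clearing PIP_CONSTRAINT ($PIP_CONSTRAINT)"
    unset PIP_CONSTRAINT
  fi
}
# After calling it:  pip install -U transformers
# Last resort only:  pip install -U --no-deps transformers
```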
**27**: Clarify the trade-off of the pyxis inline pull fallback. The note mentions re-pulling on every job but doesn't explain the implications clearly. Users should understand this is a last resort because it wastes bandwidth and time, versus a one-time `enroot import` that creates a reusable `.sqsh`.

📝 Suggested clarification
```diff
-If enroot import fails (e.g., permission errors on lustre), use pyxis inline pull as fallback — pass the NGC URI directly to `--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>"`. Note this re-pulls on every job.
+If enroot import fails (e.g., permission errors on lustre), use pyxis inline pull as fallback — pass the NGC URI directly to `--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>"`. **Note**: this re-pulls the image on every job (wasting bandwidth and startup time), so only use it when `.sqsh` creation is not possible.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/ptq/references/slurm-setup-ptq.md at line 27, Update the note about the pyxis inline pull fallback to clearly state the trade-off: explain that passing the NGC URI via --container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>" causes Pyxis to re-pull the container on every job (increasing network bandwidth and job startup time), so it should be used only as a last-resort when enroot import fails (e.g., permission errors on Lustre); contrast this with enroot import which creates a reusable .sqsh image with a one-time download and is preferred for performance and bandwidth savings.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fa4984e7-7ccd-4437-88b6-e2c726d6bae7
📒 Files selected for processing (2)
- `.claude/skills/ptq/references/slurm-setup-ptq.md`
- `.claude/skills/ptq/references/unsupported-models.md`
🚧 Files skipped from review as they are similar to previous changes (1)
- .claude/skills/ptq/references/unsupported-models.md
What does this PR do?
Type of change: Improve
Summary:
- Update MoE Pattern 2 for transformers 5.0 unified fused experts (`_QuantFusedExperts` auto-detection)
- Add `PIP_CONSTRAINT` workaround and `PYTHONPATH` guidance for NGC containers
- Add pip error diagnostic tip (`ResolutionImpossible` ≠ network failure)
- Remove duplicated warnings across files — single source of truth per topic

**Changes by file:**

| File | Change |
|---|---|
| references/slurm-setup-ptq.md | Container dependency section: PYTHONPATH preferred, PIP_CONSTRAINT workaround, --no-deps fallback |
| references/unsupported-models.md | MoE Pattern 2 updated for transformers 5.0 auto-detection. Pip install advice points to slurm-setup-ptq.md. Pip error diagnostic added |
| SKILL.md | Common Pitfalls simplified — warnings point to references instead of duplicating |
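The "`ResolutionImpossible` ≠ network failure" diagnostic tip could be sketched as a classifier over pip's error output; this helper is hypothetical (not part of the skill docs), and the message substrings are illustrative:

```shell
# Classify a pip failure message so the right remediation is suggested.
classify_pip_error() {
  case "$1" in
    *ResolutionImpossible*)
      echo "dependency conflict: try unsetting PIP_CONSTRAINT" ;;
    *"Connection timed out"*|*"Temporary failure in name resolution"*)
      echo "network failure: retry or check proxy" ;;
    *)
      echo "other" ;;
  esac
}
```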
Usage
Testing
Tested on gemma4 dense and MoE models.
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: ✅ / ❌ / N/A
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ / ❌ / N/A
- Did you write any new necessary tests?: ✅ / ❌ / N/A
- Did you update Changelog?: ✅ / ❌ / N/A

Additional Information
Summary by CodeRabbit

- **Documentation**
  - Clarified Transformers-version checks (prefer config.json) and warned container upgrades can be blocked by PIP_CONSTRAINT; added pointer to remediation.
  - Shortened Docker/NFS guidance by cross-referencing setup docs instead of explicit commands.
  - Reworked SLURM/container workflow to prefer existing images and add an import → pull fallback.
  - Added in-job dependency remediation steps and clarified MoE auto-detection differences and pip conflict troubleshooting.