Conversation
📝 Walkthrough

This change adds comprehensive support for the Qwen3.5 model family across the quantization framework: configuration mappings, quantization exclusions for specific layer projections, documentation updates reflecting support status, and corresponding test utilities and test cases.
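Layer-projection exclusions of the kind mentioned above are typically expressed as module-name patterns matched against each module's dotted path. A minimal standalone sketch of that matching logic (the patterns and the `should_quantize` helper are illustrative assumptions, not the framework's actual config keys):

```python
from fnmatch import fnmatch

# Hypothetical exclusion patterns for Qwen3.5-style layers; the real
# framework's pattern keys may differ.
EXCLUDE_PATTERNS = [
    "*linear_attn.conv1d*",  # assumed: conv projection kept in full precision
    "*lm_head*",
]

def should_quantize(module_name: str) -> bool:
    """Return False for modules matching any exclusion pattern."""
    return not any(fnmatch(module_name, pat) for pat in EXCLUDE_PATTERNS)

print(should_quantize("model.layers.0.self_attn.q_proj"))    # True
print(should_quantize("model.layers.0.linear_attn.conv1d"))  # False
print(should_quantize("lm_head"))                            # False
```

The same wildcard style is what lets a config exclude, say, every layer's `conv1d` projection with a single entry instead of enumerating layer indices.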
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (warning)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/torch/quantization/plugins/test_huggingface.py`:
- Around line 286-293: The test currently sets has_gdn_quantized /
has_attn_quantized based only on module name and presence of
module.weight_quantizer, which can give false positives when quantization is
disabled; update the loop over model.named_modules() to also check
module.weight_quantizer.is_enabled (or truthiness of that property) before
setting the flags and ensure the final assertions verify that the found modules
have weight_quantizer.is_enabled true (i.e., assert the quantizer is enabled for
"linear_attn.in_proj_qkv" and "self_attn.q_proj" modules).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b1ad8b23-7859-407c-8231-f295b28c6ba4
📒 Files selected for processing (7)

- examples/llm_ptq/README.md
- examples/llm_ptq/example_utils.py
- examples/vlm_ptq/README.md
- modelopt/torch/export/model_utils.py
- modelopt/torch/export/quant_utils.py
- tests/_test_utils/torch/transformers_models.py
- tests/unit/torch/quantization/plugins/test_huggingface.py
```python
for name, module in model.named_modules():
    if hasattr(module, "weight_quantizer") and hasattr(module, "weight"):
        if "linear_attn.in_proj_qkv" in name:
            has_gdn_quantized = True
        if "self_attn.q_proj" in name:
            has_attn_quantized = True
assert has_gdn_quantized, "GatedDeltaNet linear layers should be quantized"
assert has_attn_quantized, "Attention linear layers should be quantized"
```
Strengthen quantization assertions to avoid false positives.
The current flags only confirm that a module with a matching name exists. They should also verify `weight_quantizer.is_enabled` so the test fails when quantization is unexpectedly disabled.
✅ Suggested test fix

```diff
-        for name, module in model.named_modules():
-            if hasattr(module, "weight_quantizer") and hasattr(module, "weight"):
-                if "linear_attn.in_proj_qkv" in name:
-                    has_gdn_quantized = True
-                if "self_attn.q_proj" in name:
-                    has_attn_quantized = True
+        for name, module in model.named_modules():
+            if hasattr(module, "weight_quantizer") and hasattr(module, "weight"):
+                if "linear_attn.in_proj_qkv" in name and module.weight_quantizer.is_enabled:
+                    has_gdn_quantized = True
+                if "self_attn.q_proj" in name and module.weight_quantizer.is_enabled:
+                    has_attn_quantized = True
```
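The false positive the reviewer describes can be reproduced without the real model: a module can carry a `weight_quantizer` attribute while that quantizer is disabled. A self-contained sketch using dummy stand-ins (not ModelOpt's actual classes):

```python
class DummyQuantizer:
    """Stand-in for a quantizer that may be attached but disabled."""
    def __init__(self, enabled: bool):
        self.is_enabled = enabled

class DummyModule:
    """Stand-in for a quantized linear module."""
    def __init__(self, enabled: bool):
        self.weight = object()  # placeholder for a weight tensor
        self.weight_quantizer = DummyQuantizer(enabled)

# Simulate a model whose attention quantizer exists but was disabled.
named_modules = {
    "layers.0.linear_attn.in_proj_qkv": DummyModule(enabled=True),
    "layers.0.self_attn.q_proj": DummyModule(enabled=False),
}

# Name-only check (the original test's logic): attribute presence alone.
name_only = {n: hasattr(m, "weight_quantizer") for n, m in named_modules.items()}

# Stronger check (the suggested fix): also require the quantizer be enabled.
enabled_too = {
    n: hasattr(m, "weight_quantizer") and m.weight_quantizer.is_enabled
    for n, m in named_modules.items()
}

print(name_only["layers.0.self_attn.q_proj"])    # True  -> false positive
print(enabled_too["layers.0.self_attn.q_proj"])  # False -> regression caught
```

The weaker check would keep passing even if a config change silently disabled attention quantization; the stronger one turns that into a test failure.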
What does this PR do?
Adds quantization support for the Qwen3.5 model family.
Usage
Testing
Before your PR is "Ready for review"

- Make sure you read and follow the Contributor guidelines and your commits are signed (git commit -s -S).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information
Summary by CodeRabbit
New Features
Documentation