NVIDIA / Model-Optimizer Public

Pull requests: NVIDIA/Model-Optimizer

131 Open 758 Closed

Pull requests list

Add SwinTransformer support for torch_onnx quantization workflow

#1235 opened Apr 10, 2026 by ajrasane Contributor

3 tasks

Fix test_collect_hidden_states: use synthetic short conversations

#1234 opened Apr 10, 2026 by yeyu-nvidia Contributor

2 tasks

vLLM fakequant: add recipe-based quantization support

#1233 opened Apr 10, 2026 by kinjalpatel27 Contributor

support Qwen3.5 quantization

#1230 opened Apr 10, 2026 by deepindeed2022

[2/N] PTQ skill change for transformers 5.0

#1229 opened Apr 10, 2026 by mxinO Contributor

Pre-initialize torch._dynamo to prevent double-registration with peft torch.compile() call

#1228 opened Apr 9, 2026 by hychiang-git Contributor

[2/3] Implicit Gemm NVFP4

#1227 opened Apr 9, 2026 by jingyu-ml Contributor • Draft

Add LTX-2 third-party license notices for legal compliance

Label: cherry-pick (after code freeze, cherry-pick into release branch for next rc; only for bug fixes and doc updates)

#1226 opened Apr 9, 2026 by kevalmorabia97 Collaborator

3 tasks

added gptq nvfp4 default recipe + docstring fix

#1224 opened Apr 9, 2026 by sugunav14 Contributor

1 task done

GPTQ vector

#1223 opened Apr 9, 2026 by sugunav14 Contributor • Draft

consolidate mbridge distillation: merge distill_hf.py into distill.py

#1220 opened Apr 9, 2026 by j-rausch Contributor

Add Gemma4 MoE quantization support

#1219 opened Apr 9, 2026 by yueshen2016 Contributor

4 tasks done

Add WaterSIC for KV-cache quantization

#1217 opened Apr 9, 2026 by kaix-nv Contributor • Draft

Add TriAttention For KV Cache Compression

#1216 opened Apr 9, 2026 by kaix-nv Contributor • Draft

Add VLM base model support for auto_quantize in hf_ptq

#1214 opened Apr 9, 2026 by yueshen2016 Contributor

Add FP8 QKVO + NVFP4 MLP PTQ recipe

#1213 opened Apr 9, 2026 by yueshen2016 Contributor

add: DFlash block diffusion speculative decoding

#1211 opened Apr 8, 2026 by ChenhanYu Collaborator

149

Add Z-Image (NextDiT/Lumina2) PTQ quantization support in diffusers example

#1205 opened Apr 8, 2026 by andrea-pilzer

Add support for postprocess exported model for block scale swizzling and support for different padding strategy

#1195 opened Apr 8, 2026 by ynankani Contributor

fix: handle accelerate CPU-offloaded models in FakeQuant export

#1194 opened Apr 8, 2026 by sungsooha Contributor

Simplify KDTrainer and enhance ModelOptHFTrainer

#1191 opened Apr 7, 2026 by realAsma Contributor

4 of 6 tasks

Add ModelOpt Triton attention kernels for WAN2.2 diffusion (sparse, skip-softmax, NVFP4)

#1190 opened Apr 7, 2026 by yeyu-nvidia Contributor

5 tasks

[chore]: weekly bump of uv.lock on main (2026-04-06)

#1180 opened Apr 6, 2026 by github-actions bot

GPTQ test

#1179 opened Apr 6, 2026 by sugunav14 Contributor • Draft

feat: parallelize fakequant export across GPUs via ThreadPoolExecutor

#1177 opened Apr 3, 2026 by sungsooha Contributor
