Missing MXFP4 quantized models for Qwen3-235B-A22B, MiniMax-M2.5, and GLM-5 #24
Summary
Several high-priority frontier MoE models are missing from the AMD Quark MXFP4 quantized models collection on HuggingFace (amd/ org). NVIDIA has already published NVFP4 versions for all of these, creating a benchmark gap.
Missing Models
| Model | HuggingFace ID | Params | Active | NVIDIA NVFP4 Available? |
|---|---|---|---|---|
| Qwen3-235B-A22B | `Qwen/Qwen3-235B-A22B` | 235B | 22B | ✅ `nvidia/Qwen3-235B-A22B-NVFP4` |
| MiniMax-M2.5 | `MiniMaxAI/MiniMax-M2.5` | 229B | 10B | ✅ `nvidia/MiniMax-M2.5-NVFP4` |
| GLM-5 | `zai-org/GLM-5` | 745B | 44B | ✅ `nvidia/GLM-5-NVFP4` |
Existing AMD MXFP4 Models (for reference)
The following models already have MXFP4 versions:
- `amd/DeepSeek-R1-0528-MXFP4` ✅
- `amd/Kimi-K2.5-MXFP4` ✅
- `amd/GLM-5-MXFP4` ✅ (exists but has weight shape issues in ATOM; see below)
- `amd/Qwen3.5-397B-A17B-MXFP4` ✅
- `amd/Qwen3-Coder-Next-MXFP4` ✅
Why This Matters
- Benchmark parity: Without MXFP4 versions, MI355X benchmarks must fall back to FP8, whose weights are 2x larger and deliver half the compute density, making MI355X vs B300 comparisons unfair
- InferenceMAX/InferenceX: SemiAnalysis benchmarks use NVFP4 on the NVIDIA side; AMD needs equivalent MXFP4 models for an apples-to-apples comparison
- Customer readiness: These are among the top-5 deployed open MoE models as of March 2026
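To make the 2x size gap concrete, here is a back-of-envelope weight-footprint comparison. It assumes 1 byte/param for FP8 versus 4-bit values plus one 8-bit shared scale per 32-element block for MXFP4 (per the OCP MX format), and ignores embeddings, KV cache, and activations; the parameter counts come from the table above.

```python
# Rough weight footprint in GiB: FP8 (1 byte/param) vs MXFP4
# (0.5 byte/param + 1 scale byte per 32-element block).
def fp8_gib(params: float) -> float:
    return params * 1.0 / 2**30

def mxfp4_gib(params: float, block: int = 32) -> float:
    # 4-bit mantissa per param plus one shared 8-bit scale per block
    return params * (0.5 + 1.0 / block) / 2**30

for name, p in [("Qwen3-235B-A22B", 235e9),
                ("MiniMax-M2.5", 229e9),
                ("GLM-5", 745e9)]:
    print(f"{name}: FP8 {fp8_gib(p):.0f} GiB vs MXFP4 {mxfp4_gib(p):.0f} GiB")
```

Even with the block-scale overhead, MXFP4 comes out roughly 1.9x smaller than FP8, which is what makes it the fair counterpart to NVFP4 in cross-vendor benchmarks.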
Additional Issue: GLM-5 MXFP4 Weight Shape Bug
`amd/GLM-5-MXFP4` exists on HuggingFace but fails to load in ATOM with:

```
RuntimeError: The size of tensor a (3072) must match the size of tensor b (12288) at non-singleton dimension 1
```
This occurs in the weight loader during MXFP4 unpacking with TP=8. GLM-5's hidden_size=6144 divided by TP=8 gives 768 columns per rank, but the MXFP4 packed tensor dimension is 12288, which corresponds to the undivided size. Please verify the quantization output format.
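The dimension arithmetic behind the error can be sketched as follows. This is a hypothetical reproduction of the numbers only, not ATOM's actual loader code: MXFP4 packs two 4-bit values per byte, so a weight with `hidden_size` columns should have `hidden_size // 2` packed columns, while the shipped checkpoint apparently reports 12288.

```python
# Illustrative shape check for MXFP4 packing under tensor parallelism.
# Variable names are ours, not ATOM's; 12288 is taken from the error above.
hidden_size = 6144
tp = 8

expected_packed_cols = hidden_size // 2   # two 4-bit values per byte -> 3072
per_rank_unpacked = hidden_size // tp     # columns per TP rank -> 768
checkpoint_packed_cols = 12288            # dim reported in the RuntimeError

# The mismatch the loader trips over: 3072 vs 12288
mismatch = checkpoint_packed_cols != expected_packed_cols
print(expected_packed_cols, per_rank_unpacked, checkpoint_packed_cols, mismatch)
```

Notably, 12288 is exactly 2 * hidden_size, which is consistent with the checkpoint storing an unpacked or transposed layout rather than the packed one the loader expects.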
Request
- Publish `amd/Qwen3-235B-A22B-MXFP4` (or Instruct variant)
- Publish `amd/MiniMax-M2.5-MXFP4`
- Fix and re-publish `amd/GLM-5-MXFP4` (weight shape issue)
cc @quark-team