Missing MXFP4 quantized models for Qwen3-235B-A22B, MiniMax-M2.5, and GLM-5 #24
Summary
Several high-priority frontier MoE models are missing from the AMD Quark MXFP4 quantized models collection on HuggingFace (amd/ org). NVIDIA has already published NVFP4 versions for all of these, creating a benchmark gap.
Missing Models
| Model | HuggingFace ID | Params | Active | NVIDIA NVFP4 Available? |
|---|---|---|---|---|
| Qwen3-235B-A22B | `Qwen/Qwen3-235B-A22B` | 235B | 22B | ✅ `nvidia/Qwen3-235B-A22B-NVFP4` |
| MiniMax-M2.5 | `MiniMaxAI/MiniMax-M2.5` | 229B | 10B | ✅ `nvidia/MiniMax-M2.5-NVFP4` |
| GLM-5 | `zai-org/GLM-5` | 745B | 44B | ✅ `nvidia/GLM-5-NVFP4` |
Existing AMD MXFP4 Models (for reference)
The following models already have MXFP4 versions:
- `amd/DeepSeek-R1-0528-MXFP4` ✅
- `amd/Kimi-K2.5-MXFP4` ✅
- `amd/GLM-5-MXFP4` ✅ (exists but has weight shape issues in ATOM; see below)
- `amd/Qwen3.5-397B-A17B-MXFP4` ✅
- `amd/Qwen3-Coder-Next-MXFP4` ✅
Why This Matters
- Benchmark parity: Without MXFP4 versions, MI355X benchmarks must fall back to FP8, whose weights are 2x larger and deliver half the compute density, making MI355X vs B300 comparisons unfair
- InferenceMAX/InferenceX: SemiAnalysis benchmarks use NVFP4 on the NVIDIA side; AMD needs equivalent MXFP4 models for an apples-to-apples comparison
- Customer readiness: These are among the top-5 deployed open MoE models as of March 2026
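To make the 2x size gap concrete, here is a back-of-envelope weight-footprint comparison. It assumes 1 byte/param for FP8 versus 4-bit values plus one 8-bit shared scale per 32-element block for MXFP4 (per the OCP MX format), and ignores embeddings, KV cache, and activations; the parameter counts come from the table above.

```python
# Rough weight footprint in GiB: FP8 (1 byte/param) vs MXFP4
# (0.5 byte/param + 1 scale byte per 32-element block).
def fp8_gib(params: float) -> float:
    return params * 1.0 / 2**30

def mxfp4_gib(params: float, block: int = 32) -> float:
    # 4-bit mantissa per param plus one shared 8-bit scale per block
    return params * (0.5 + 1.0 / block) / 2**30

for name, p in [("Qwen3-235B-A22B", 235e9),
                ("MiniMax-M2.5", 229e9),
                ("GLM-5", 745e9)]:
    print(f"{name}: FP8 {fp8_gib(p):.0f} GiB vs MXFP4 {mxfp4_gib(p):.0f} GiB")
```

Even with the block-scale overhead, MXFP4 comes out roughly 1.9x smaller than FP8, which is what makes it the fair counterpart to NVFP4 in cross-vendor benchmarks.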
Additional Issue: GLM-5 MXFP4 Weight Shape Bug
`amd/GLM-5-MXFP4` exists on HuggingFace but fails to load in ATOM with:

```
RuntimeError: The size of tensor a (3072) must match the size of tensor b (12288) at non-singleton dimension 1
```
This occurs in the weight loader during MXFP4 unpacking with TP=8. GLM-5's hidden_size=6144 divided by TP=8 gives 768 columns per rank, but the MXFP4 packed tensor dimension is 12288, which corresponds to the undivided size. Please verify the quantization output format.
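The dimension arithmetic behind the error can be sketched as follows. This is a hypothetical reproduction of the numbers only, not ATOM's actual loader code: MXFP4 packs two 4-bit values per byte, so a weight with `hidden_size` columns should have `hidden_size // 2` packed columns, while the shipped checkpoint apparently reports 12288.

```python
# Illustrative shape check for MXFP4 packing under tensor parallelism.
# Variable names are ours, not ATOM's; 12288 is taken from the error above.
hidden_size = 6144
tp = 8

expected_packed_cols = hidden_size // 2   # two 4-bit values per byte -> 3072
per_rank_unpacked = hidden_size // tp     # columns per TP rank -> 768
checkpoint_packed_cols = 12288            # dim reported in the RuntimeError

# The mismatch the loader trips over: 3072 vs 12288
mismatch = checkpoint_packed_cols != expected_packed_cols
print(expected_packed_cols, per_rank_unpacked, checkpoint_packed_cols, mismatch)
```

Notably, 12288 is exactly 2 * hidden_size, which is consistent with the checkpoint storing an unpacked or transposed layout rather than the packed one the loader expects.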
Request
- Publish `amd/Qwen3-235B-A22B-MXFP4` (or Instruct variant)
- Publish `amd/MiniMax-M2.5-MXFP4`
- Fix and re-publish `amd/GLM-5-MXFP4` (weight shape issue)
cc @quark-team