32 changes: 32 additions & 0 deletions perf-changelog.yaml
@@ -228,3 +228,35 @@
  description:
    - "Fix AITER env vars for vLLM v0.14.0 on AMD MI300X and MI325X"
  pr-link: https://github.com/InferenceMAX/InferenceMAX/pull/535

- config-keys:
    # NVIDIA single-node
    - dsr1-fp4-b200-sglang
    - dsr1-fp4-b200-trt
    - dsr1-fp4-b200-trt-mtp
    - dsr1-fp8-b200-sglang
    - dsr1-fp8-b200-trt
    - dsr1-fp8-b200-trt-mtp
    - dsr1-fp8-h200-sglang
    - dsr1-fp8-h200-trt
    - dsr1-fp8-h200-trt-mtp
    - gptoss-fp4-b200-trt
    - gptoss-fp4-b200-vllm
    - gptoss-fp4-h100-vllm
    - gptoss-fp4-h200-trt
    - gptoss-fp4-h200-vllm
    # AMD single-node
    - dsr1-fp4-mi355x-sglang
    - dsr1-fp4-mi355x-atom
    - dsr1-fp8-mi300x-sglang
    - dsr1-fp8-mi325x-sglang
    - dsr1-fp8-mi355x-sglang
    - dsr1-fp8-mi355x-atom
    - gptoss-fp4-mi300x-vllm
    - gptoss-fp4-mi325x-vllm
    - gptoss-fp4-mi355x-vllm
    - gptoss-fp4-mi355x-atom
  description:
    - Add official GSM8k eval results to GPT-OSS and DeepSeek R1 scenarios
  pr-link: https://github.com/InferenceMAX/InferenceMAX/pull/558
  evals-only: true