Pull requests: SemiAnalysisAI/InferenceX
Pull requests list
Optimize Kimi-K2.5-MXFP4 on MI355X: Enable AITER, Expert Parallel, and update to vLLM v0.18.0
#936
opened Mar 24, 2026 by
ChuanLi1101
1 task done
[WIP] Enable VLLM_USE_FLASHINFER_MOE_INT4=1 for Kimi K2.5 INT4 B200
NVIDIA
sweep-enabled
#935
opened Mar 23, 2026 by
ankursingh-nv
fix: add --exclusive to MI300X SLURM salloc for accurate benchmarks
#930
opened Mar 23, 2026 by
cquil11
1 task
Disable prefix cache for kimi vllm configs
sweep-enabled
#926
opened Mar 23, 2026 by
Oseltamivir
[WIP] Add Qwen3.5 h200 MTP
NVIDIA
sweep-enabled
#921
opened Mar 20, 2026 by
hshrivastava-droid
Separate eval-only workflow and change to 8k1k
sweep-enabled
#911
opened Mar 15, 2026 by
Oseltamivir
fix: multi-turn benchmark hangs after all clients finish
#908
opened Mar 13, 2026 by
lishicheng1996-nv
3 of 4 tasks
[NVIDIA] Update NVIDIA GPT-OSS vLLM image from v0.15.1 to v0.16.0
sweep-enabled
#904
opened Mar 10, 2026 by
cquil11
[WIP] add perf comparison on non-main branches with run-sweep workflow
sweep-enabled
#880
opened Mar 6, 2026 by
cquil11
Add Kimi-K2.5 INT4 vLLM v0.16.0 benchmark for MI300X
AMD
sweep-enabled
#860
opened Mar 3, 2026 by
functionstackx
Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4)
AMD
sweep-enabled
#827
opened Mar 1, 2026 by
functionstackx
[NV] Qwen3.5 B200 SGLang FP4 configs
NVIDIA
sweep-enabled
#820
opened Feb 27, 2026 by
kedarpotdar-nv
[NVIDIA] Update NVIDIA single-node DSR1 SGLang images from v0.5.6-v0.5.8 to v0.5.9
image update
NVIDIA
sweep-enabled
#814
opened Feb 26, 2026 by
cquil11
Performance Improvements for MI300X with GEMM and FP8 Enhancements
#811
opened Feb 26, 2026 by
chunfangamd
[WIP] [NV] Updates SGLang DSR1-FP4 GB300 1k8k configurations (STP only)
NVIDIA
#637
opened Feb 5, 2026 by
yunzhoul-nv
[WIP] [NV] Updates SGLang DSR1-FP4 GB200 1k8k (STP only)
NVIDIA
#634
opened Feb 5, 2026 by
yunzhoul-nv
Draft