Chun reima/sglang fp8 v0.5.8 mi355 #572

Merged
functionstackx merged 5 commits into main from chun_reima/sglang_fp8_v0.5.8_mi355
Jan 27, 2026

@chunfangamd
Collaborator

Update MI355X DeepSeek R1 FP8 SGLang image from v0.5.5.post3 to v0.5.8

Key fix: Disable mla persistent kernel when not using fp8 kv_cache (sgl-project/sglang#17327)
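For readers unfamiliar with the fix, the gating it describes can be sketched roughly as follows. This is an illustration only, not the code from sgl-project/sglang#17327: the `AttentionConfig` class, the `use_mla_persistent_kernel` helper, and the example dtype strings are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionConfig:
    """Hypothetical settings holder; not SGLang's real configuration class."""

    # Illustrative values only, e.g. "auto", "bfloat16", "fp8_e4m3".
    kv_cache_dtype: str = "auto"


def use_mla_persistent_kernel(cfg: AttentionConfig) -> bool:
    """Enable the MLA persistent kernel only when the KV cache is stored as FP8.

    Assumption for this sketch: FP8 cache dtypes are identified by an
    "fp8" prefix in the dtype string.
    """
    return cfg.kv_cache_dtype.lower().startswith("fp8")


if __name__ == "__main__":
    for dtype in ("auto", "bfloat16", "fp8_e4m3"):
        enabled = use_mla_persistent_kernel(AttentionConfig(kv_cache_dtype=dtype))
        print(f"{dtype}: persistent kernel enabled = {enabled}")
```

Under this sketch, any non-FP8 KV cache setting falls back to the non-persistent path, which matches the behavior the fix is described as enforcing.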

Contributor list:

  • 1am9trash
  • HaiShaw
  • chunfangamd
  • rkarhila-amd

@functionstackx
Contributor

@pr-claude review plz

@claude
Contributor

claude bot commented Jan 27, 2026

Claude finished @functionstackx's task in 51s


PR Review Complete

  • Gather context and read changed files
  • Check for CLAUDE.md instructions (none found)
  • Validate master config and perf-changelog requirements
  • Review code changes for bugs/issues
  • Post review summary

Summary: This PR updates the MI355X DeepSeek R1 FP8 SGLang Docker image from v0.5.5.post3 to v0.5.8. The master config change is properly accompanied by a perf-changelog entry.

I left one suggestion: the perf-changelog description should mention the key fix (disabling MLA persistent kernel when not using fp8 kv_cache) as noted in the PR body and requested by @functionstackx in the inline comment.

No blocking issues found.
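The pairing requirement checked above (a master config change must come with a perf-changelog entry) could be expressed as a small check like the one below. This is a hedged sketch, not the repository's actual validation: the `MASTER_CONFIG_SUFFIX` value and the `missing_changelog_entry` helper are hypothetical, and only the `perf-changelog.yaml` file name comes from this PR.

```python
from typing import Iterable

# Hypothetical suffix for master config files; the real path is not shown in this PR.
MASTER_CONFIG_SUFFIX = "master.yaml"
# File name taken from the commit message in this PR.
CHANGELOG_NAME = "perf-changelog.yaml"


def missing_changelog_entry(changed_files: Iterable[str]) -> bool:
    """Return True when a master config changed but the changelog did not."""
    files = list(changed_files)
    touches_config = any(path.endswith(MASTER_CONFIG_SUFFIX) for path in files)
    touches_changelog = any(path.endswith(CHANGELOG_NAME) for path in files)
    return touches_config and not touches_changelog


if __name__ == "__main__":
    # Example file lists are made up for illustration.
    print(missing_changelog_entry(["configs/amd-master.yaml"]))
    print(missing_changelog_entry(["configs/amd-master.yaml", "perf-changelog.yaml"]))
```

A check like this only confirms that both files were touched; whether the changelog description is adequate still needs a human or reviewer pass, as in the suggestion above.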

Add description: Disables mla persistent kernel when not using fp8 kv_cache

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx
Contributor

The e2e tests already ran here, so I cancelled sweep-enabled: https://github.com/InferenceMAX/InferenceMAX/actions/runs/21399947035

@functionstackx functionstackx merged commit 2a796d1 into main Jan 27, 2026
11 checks passed
@functionstackx functionstackx deleted the chun_reima/sglang_fp8_v0.5.8_mi355 branch January 27, 2026 17:36
cquil11 pushed a commit that referenced this pull request Jan 29, 2026
* Update SGLang Docker Image for MI355 to v0.5.8

1. activate FP8 KV cache
2. use the MLA persistent kernel

* Do not activate FP8 KV cache and the MLA persistent kernel explicitly

* Add config-keys (v0.5.5.post3 --> v0.5.8)

* Update perf-changelog.yaml with key fix description for v0.5.8

Add description: Disables mla persistent kernel when not using fp8 kv_cache

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>