Add DSR1 FP8 H200 Dynamo TRT-LLM configurations #570

Merged: cquil11 merged 35 commits into main from nv/dsr1-fp8-h200-dynamo-trtllm-260126 on Jan 29, 2026

Collaborator

nlevin-ui commented Jan 26, 2026

DSR1 FP8 H200 Dynamo TRT-LLM Disagg

  • 1k/1k MTP on/off
  • 8k/1k MTP on/off

csahithi and others added 18 commits January 26, 2026 09:41
Expand dsr1-fp8-h200-dynamo-trt section with full configuration set:
- 1k1k MTP configs (c4-c512) with CONFIG_FILE references
- 1k1k STP configs (c4-c512) with CONFIG_FILE references
- 8k1k MTP configs (c4-c512) with CONFIG_FILE references
- 8k1k STP configs (c4-c512) with CONFIG_FILE references

All configs reference recipe YAMLs in srt-slurm-trtllm repo under
recipies/trtllm/h200/{1k1k,8k1k}/{mtp,stp}/
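For orientation, the brace pattern in that path expands to four recipe directories. The snippet below is only a sketch of that expansion: the path prefix is copied verbatim from the commit message (including the "recipies" spelling the repo used at the time), the function name `recipe_dirs` is hypothetical, and reading `stp` as the MTP-off variant is an inference from the PR description's "MTP on/off". The individual concurrency files (c4-c512) are not enumerated because the commit message does not list their names.

```python
from itertools import product

# Prefix copied from the commit message, spelling included.
RECIPE_ROOT = "recipies/trtllm/h200"


def recipe_dirs(root: str = RECIPE_ROOT) -> list[str]:
    """Expand {1k1k,8k1k}/{mtp,stp} into the four recipe directories."""
    seq_lens = ["1k1k", "8k1k"]    # the two sequence-length setups in the PR
    decode_modes = ["mtp", "stp"]  # MTP on / MTP off (assumed reading of stp)
    return [f"{root}/{s}/{m}/" for s, m in product(seq_lens, decode_modes)]


if __name__ == "__main__":
    for path in recipe_dirs():
        print(path)
```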
dep8 = enable_attention_dp: true = dp-attn: true
tep8 = enable_attention_dp: false = dp-attn: false
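The dep8/tep8 shorthand above can be read as a small lookup. The following is a minimal sketch, not code from either repo: the function name `attention_dp_flags` is hypothetical, and accepting any `dep*`/`tep*` tag (rather than only `dep8` and `tep8`, the two the commit message names) is an assumption.

```python
def attention_dp_flags(parallel_tag: str) -> dict[str, bool]:
    """Map a dep*/tep* tag to the two equivalent attention-DP settings.

    Per the commit message:
      dep8 -> enable_attention_dp: true  -> dp-attn: true
      tep8 -> enable_attention_dp: false -> dp-attn: false
    """
    prefix = parallel_tag[:3].lower()
    if prefix == "dep":
        enabled = True
    elif prefix == "tep":
        enabled = False
    else:
        # Anything else is outside what the commit message describes.
        raise ValueError(f"unrecognized parallelism tag: {parallel_tag!r}")
    # Both keys carry the same value; they are two spellings of one switch.
    return {"enable_attention_dp": enabled, "dp-attn": enabled}


if __name__ == "__main__":
    print(attention_dp_flags("dep8"))
    print(attention_dp_flags("tep8"))
```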
Update MODEL_PATH from /models/dsr1-fp8 (old DeepSeek-R1) to
/models/DeepSeek-R1-0528 (new version matching nvidia-master.yaml)
Collaborator

cquil11 commented Jan 26, 2026

You must update perf-changelog.yaml before the sweep will kick off.

Collaborator

cquil11 commented Jan 27, 2026

Test sweeping again to smoke-test @ishandhanani's additions:
https://github.com/InferenceMAX/InferenceMAX/actions/runs/21411593275

Collaborator

ishandhanani commented Jan 27, 2026

This should not have been merged into 1 PR. Sweeps for SGLang won't pass yet.

In the future, please allow me to indicate when things are ready to merge into 1 mega PR. It's easier on our end if we can keep things separate.

Collaborator

cquil11 commented Jan 27, 2026

sure np. nevertheless it appears sweeps (without your changes) are still failing?

edit: is it bc of "recipies" LMFAO (my bad I didn't mean yall actually had to change this 😭 )

cquil11 and others added 6 commits January 27, 2026 13:56
- Update SQUASH_FILE to use /data/containers/ with + separators
- Strip nvcr.io/ prefix from path to match actual .sqsh filenames
- Add CONTAINER_KEY to convert IMAGE to srt-slurm format (nvcr.io#)
- Map container key to .sqsh path dynamically in srtslurm.yaml
Use the release branch for Q1 2026 submission instead of main.
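The container-path commits above describe two string transformations on the image name. The sketch below shows one way they could look; it is an illustration under stated assumptions, not the script from the PR. The function names, the example image, and the choice to turn the tag's `:` into `+` along with `/` are all assumptions (the commit message only says "+ separators"); `/data/containers/`, the `nvcr.io/` prefix, the `nvcr.io#` key form, and the `.sqsh` extension come from the commit messages.

```python
REGISTRY = "nvcr.io"
CONTAINER_DIR = "/data/containers"  # directory named in the commit message


def container_key(image: str) -> str:
    """Convert nvcr.io/<path>:<tag> into the nvcr.io#<path>:<tag> form."""
    registry, sep, rest = image.partition("/")
    if registry != REGISTRY or not sep or not rest:
        raise ValueError(f"expected an image under {REGISTRY}/, got {image!r}")
    return f"{registry}#{rest}"


def squash_file(image: str) -> str:
    """Derive a .sqsh path: strip the registry prefix, join parts with '+'."""
    rest = image.removeprefix(REGISTRY + "/")  # requires Python 3.9+
    # Assumption: both '/' and ':' become '+' in the on-disk filename.
    flat = rest.replace("/", "+").replace(":", "+")
    return f"{CONTAINER_DIR}/{flat}.sqsh"


if __name__ == "__main__":
    image = "nvcr.io/example-org/example-image:1.0"  # hypothetical image name
    print(container_key(image))
    print(squash_file(image))
```

Keeping the two helpers separate mirrors the commits, which treat the key and the squash-file path as distinct values that are then mapped to each other in srtslurm.yaml.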
@ishandhanani
Collaborator

edit: is it bc of "recipies" LMFAO

LOL we had to fix it at some point. Remnant of me just moving way too fast and not having any spell check in the repo 😆

@functionstackx
Contributor

This should not have been merged into 1 PR.

mb guys!

Collaborator

cquil11 commented Jan 28, 2026

@nlevin-ui left a couple more comments, mainly nits. In general, please just follow the conventions set in #585 if you can. Also see comments in #510 which may also apply to this PR.

Link each CONFIG_FILE to its source in srt-slurm sa-submission-q1-2026 branch.
Keep all entries:
- H200 dynamo-trt entry (this PR)
- Evals-only entry (PR #558)
- B300 dynamo-trt entry (PR #585)
Removed outdated DSR1 FP8 H200 Dynamo TRT configuration details and re-added them in a new section.
Collaborator

cquil11 commented Jan 29, 2026

ok. lgtm, thanks for all your hard work on this.

cquil11 merged commit d7a6d4e into main Jan 29, 2026
7 of 38 checks passed
cquil11 deleted the nv/dsr1-fp8-h200-dynamo-trtllm-260126 branch January 29, 2026 15:57

6 participants