Revert "[nv] - h200 sglang disagg" #581
Merged
cquil11 merged 1 commit into nv/dsr1-fp8-h200-dynamo-trtllm-260126 on Jan 27, 2026
This reverts commit f6609d9.
cquil11 added a commit that referenced this pull request on Jan 29, 2026
* add h200 srtslurm setup
* fix slurm parameters
* refactor logs
* fix log path
* refactor
* fix slurm output
* fix
* fix
* Add H200 dynamo-trt disaggregated configs for 1k1k and 8k1k
Expand dsr1-fp8-h200-dynamo-trt section with full configuration set:
- 1k1k MTP configs (c4-c512) with CONFIG_FILE references
- 1k1k STP configs (c4-c512) with CONFIG_FILE references
- 8k1k MTP configs (c4-c512) with CONFIG_FILE references
- 8k1k STP configs (c4-c512) with CONFIG_FILE references
All configs reference recipe YAMLs in srt-slurm-trtllm repo under
recipies/trtllm/h200/{1k1k,8k1k}/{mtp,stp}/
* add uv and sqsh file
* Update H200 dynamo-trt container image to nvcr.io#nvidia/ai-dynamo/tensorrtllm-runtime:0.8.0
* Fix H200 CONFIG_FILE references to match corrected recipe filenames
* Fix H200 dp-attn values to match recipe filenames
dep8 = enable_attention_dp: true = dp-attn: true
tep8 = enable_attention_dp: false = dp-attn: false
* Fix H200 model path to use DeepSeek-R1-0528
Update MODEL_PATH from /models/dsr1-fp8 (old DeepSeek-R1) to
/models/DeepSeek-R1-0528 (new version matching nvidia-master.yaml)
* add uv install and image fix
* fix squash file path
* fix srt-slurm repo
* update h200 runner details
* Add perf-changelog entry for dsr1-fp8-h200-dynamo-trt config
* Add H200 sglang disagg configs from srtslurm (#580)
- Add dsr1-fp8-h200-dynamo-sglang config to nvidia-master.yaml
- Include 1k1k configs: aggregated, low-latency (1P9D), high-throughput TEP/DEP (1P6D)
- Include 8k1k configs: aggregated, TEP variants (1P7D, 1P6D, 1P3D, 2P3D), DEP (1P1D)
- Add perf-changelog entry for new configuration
- Document recipe registration process in AGENT.md
* Revert "Add H200 sglang disagg configs from srtslurm (#580)" (#581)
This reverts commit f6609d9.
* recipies -> recipes
* Update dsr1-fp8-h200-dynamo-trt image to 0.8.1.post1
* Add dynamic container mapping for srtslurm.yaml
- Update SQUASH_FILE to use /data/containers/ with + separators
- Strip nvcr.io/ prefix from path to match actual .sqsh filenames
- Add CONTAINER_KEY to convert IMAGE to srt-slurm format (nvcr.io#)
- Map container key to .sqsh path dynamically in srtslurm.yaml
* Pin srt-slurm to sa-submission-q1-2026 branch
Use the release branch for Q1 2026 submission instead of main.
* Add srt-slurm GitHub URLs above h200 CONFIG_FILE entries
Link each CONFIG_FILE to its source in srt-slurm sa-submission-q1-2026 branch.
* Update perf-changelog.yaml to modify DSR1 configurations
Removed outdated DSR1 FP8 H200 Dynamo TRT configuration details and re-added them in a new section.
---------
Co-authored-by: Sahithi Chigurupati <schigurupati@nvidia.com>
Co-authored-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
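
The "dynamic container mapping" item in the commit message above describes deriving two strings from one image reference: a srt-slurm style container key (`nvcr.io#...`) and a `.sqsh` path under `/data/containers/` with `+` separators. A minimal Python sketch of that derivation follows. The function names `container_key` and `squash_file` are hypothetical, and the assumption that both `/` and `:` become `+` in the filename is not confirmed by the commit message; the actual repository may implement this differently (for example in shell or YAML).

```python
def container_key(image: str) -> str:
    """Turn 'registry/path:tag' into 'registry#path:tag' (hypothetical helper)."""
    registry, sep, rest = image.partition("/")
    # Only the first '/' (after the registry host) is replaced with '#'.
    return f"{registry}#{rest}" if sep else image


def squash_file(image: str, root: str = "/data/containers") -> str:
    """Build a .sqsh path for an image (hypothetical helper).

    Assumption: the 'nvcr.io/' prefix is dropped, then '/' and ':' are
    both replaced with '+'.
    """
    prefix = "nvcr.io/"
    path = image[len(prefix):] if image.startswith(prefix) else image
    name = path.replace("/", "+").replace(":", "+")
    return f"{root}/{name}.sqsh"


if __name__ == "__main__":
    image = "nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1.post1"
    print(container_key(image))
    print(squash_file(image))
```

Keeping both values derived from the single image string means a version bump only has to touch one place, which appears to be the motivation for the change.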
Reverts InferenceMAX/InferenceMAX#580