Revert "[nv] - h200 sglang disagg" #581

Merged
cquil11 merged 1 commit into nv/dsr1-fp8-h200-dynamo-trtllm-260126 from revert-580-nv/h200-sglang-disagg
Jan 27, 2026
Conversation

@cquil11
Collaborator

@cquil11 cquil11 commented Jan 27, 2026

Reverts InferenceMAX/InferenceMAX#580

@cquil11 cquil11 requested a review from a team as a code owner January 27, 2026 19:49
@cquil11 cquil11 merged commit 796afbe into nv/dsr1-fp8-h200-dynamo-trtllm-260126 Jan 27, 2026
@cquil11 cquil11 deleted the revert-580-nv/h200-sglang-disagg branch January 27, 2026 19:49
cquil11 added a commit that referenced this pull request Jan 29, 2026
* add h200 srtslurm setup

* fix slurm parameters

* refactor logs

* fix log path

* refactor

* fix slurm output

* fix

* fix

* Add H200 dynamo-trt disaggregated configs for 1k1k and 8k1k

Expand dsr1-fp8-h200-dynamo-trt section with full configuration set:
- 1k1k MTP configs (c4-c512) with CONFIG_FILE references
- 1k1k STP configs (c4-c512) with CONFIG_FILE references
- 8k1k MTP configs (c4-c512) with CONFIG_FILE references
- 8k1k STP configs (c4-c512) with CONFIG_FILE references

All configs reference recipe YAMLs in srt-slurm-trtllm repo under
recipies/trtllm/h200/{1k1k,8k1k}/{mtp,stp}/

* add uv and sqsh file

* Update H200 dynamo-trt container image to nvcr.io#nvidia/ai-dynamo/tensorrtllm-runtime:0.8.0

* Fix H200 CONFIG_FILE references to match corrected recipe filenames

* Fix H200 dp-attn values to match recipe filenames

dep8 = enable_attention_dp: true = dp-attn: true
tep8 = enable_attention_dp: false = dp-attn: false
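
The dep8/tep8 mapping above can be sketched as a small lookup. This is only an illustration: the helper name `dp_attn_from_recipe` and the example filenames are hypothetical, and the assumption is that each recipe filename carries a `dep<N>` or `tep<N>` token.

```python
import re

# Hypothetical helper, not part of the repository: derive the dp-attn flag
# from a recipe filename, following the mapping stated in the commit message
# (dep8 -> dp-attn: true, tep8 -> dp-attn: false).
_TOKEN = re.compile(r"(?<![a-z])(dep|tep)\d+")


def dp_attn_from_recipe(filename: str) -> bool:
    """Return True for DEP recipes (attention DP enabled), False for TEP."""
    match = _TOKEN.search(filename.lower())
    if match is None:
        raise ValueError(f"no dep/tep token found in {filename!r}")
    return match.group(1) == "dep"


if __name__ == "__main__":
    # Filenames below are made up for illustration only.
    for name in ("c64_dep8.yaml", "c4_tep8.yaml"):
        print(name, "-> dp-attn:", dp_attn_from_recipe(name))
```

A check like this could flag configs where the `dp-attn` value and the referenced recipe filename disagree, which is the mismatch this commit fixes by hand.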

* Fix H200 model path to use DeepSeek-R1-0528

Update MODEL_PATH from /models/dsr1-fp8 (old DeepSeek-R1) to
/models/DeepSeek-R1-0528 (new version matching nvidia-master.yaml)

* add uv install and image fix

* fix squash file path

* fix srt-slurm repo

* update h200 runner details

* Add perf-changelog entry for dsr1-fp8-h200-dynamo-trt config

* Add H200 sglang disagg configs from srtslurm (#580)

- Add dsr1-fp8-h200-dynamo-sglang config to nvidia-master.yaml
- Include 1k1k configs: aggregated, low-latency (1P9D), high-throughput TEP/DEP (1P6D)
- Include 8k1k configs: aggregated, TEP variants (1P7D, 1P6D, 1P3D, 2P3D), DEP (1P1D)
- Add perf-changelog entry for new configuration
- Document recipe registration process in AGENT.md

* Revert "Add H200 sglang disagg configs from srtslurm (#580)" (#581)

This reverts commit f6609d9.

* recipies -> recipes

* Update dsr1-fp8-h200-dynamo-trt image to 0.8.1.post1

* Add dynamic container mapping for srtslurm.yaml

- Update SQUASH_FILE to use /data/containers/ with + separators
- Strip nvcr.io/ prefix from path to match actual .sqsh filenames
- Add CONTAINER_KEY to convert IMAGE to srt-slurm format (nvcr.io#)
- Map container key to .sqsh path dynamically in srtslurm.yaml
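
A minimal sketch of the mapping described in the bullets above, assuming IMAGE is given in the usual `nvcr.io/<path>:<tag>` form. The function names are hypothetical, and replacing both `/` and `:` with `+` is an assumption; the commit message only says the path uses `+` separators.

```python
REGISTRY_PREFIX = "nvcr.io/"          # prefix stripped from the .sqsh filename
CONTAINER_DIR = "/data/containers"    # directory named in the commit message


def container_key(image: str) -> str:
    """Convert IMAGE to the srt-slurm style key (nvcr.io#...)."""
    if not image.startswith(REGISTRY_PREFIX):
        raise ValueError(f"expected an nvcr.io image, got {image!r}")
    return "nvcr.io#" + image[len(REGISTRY_PREFIX):]


def squash_file(image: str) -> str:
    """Build the .sqsh path: strip nvcr.io/, then join components with '+'.

    Assumption: both '/' and ':' become '+'.
    """
    if not image.startswith(REGISTRY_PREFIX):
        raise ValueError(f"expected an nvcr.io image, got {image!r}")
    name = image[len(REGISTRY_PREFIX):]
    return f"{CONTAINER_DIR}/{name.replace('/', '+').replace(':', '+')}.sqsh"


if __name__ == "__main__":
    image = "nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1.post1"
    print(container_key(image))
    print(squash_file(image))
```

With both values derived from IMAGE, the key-to-path entry in srtslurm.yaml can be generated per run rather than hard-coded for each container version.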

* Pin srt-slurm to sa-submission-q1-2026 branch

Use the release branch for Q1 2026 submission instead of main.

* Add srt-slurm GitHub URLs above h200 CONFIG_FILE entries

Link each CONFIG_FILE to its source in srt-slurm sa-submission-q1-2026 branch.

* Update perf-changelog.yaml to modify DSR1 configurations

Removed outdated DSR1 FP8 H200 Dynamo TRT configuration details and re-added them in a new section.

---------

Co-authored-by: Sahithi Chigurupati <schigurupati@nvidia.com>
Co-authored-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
