[nv] - h200 sglang disagg #580
Merged
cquil11 merged 1 commit into nv/dsr1-fp8-h200-dynamo-trtllm-260126 on Jan 27, 2026
Conversation
- Add dsr1-fp8-h200-dynamo-sglang config to nvidia-master.yaml
- Include 1k1k configs: aggregated, low-latency (1P9D), high-throughput TEP/DEP (1P6D)
- Include 8k1k configs: aggregated, TEP variants (1P7D, 1P6D, 1P3D, 2P3D), DEP (1P1D)
- Add perf-changelog entry for new configuration
- Document recipe registration process in AGENT.md
cquil11 approved these changes on Jan 27, 2026
Collaborator
nice nice. lgtm
thx @ishandhanani
will wait for test sweep to pass as smoke test
cquil11 requested changes on Jan 27, 2026
Collaborator
will wait for prerequisite PR to merge as stated in desc
Collaborator
nvm ur merging into prereq branch
Collaborator
Author
why merge this one?
Collaborator
u were targeting https://github.com/InferenceMAX/InferenceMAX/tree/nv/dsr1-fp8-h200-dynamo-trtllm-260126 as your base, so I thought u wanted to merge into that one and then merge them into main altogether. I can revert this one, and u can merge into main later?
cquil11 added a commit that referenced this pull request on Jan 27, 2026
Collaborator
Author
Yea. Let's get Nir's PR in first and then get this one merged in afterwards. The base will change when Nir's goes into main (should be soon). I will reopen.
cquil11 added a commit that referenced this pull request on Jan 29, 2026
* add h200 srtslurm setup
* fix slurm parameters
* refactor logs
* fix log path
* refactor
* fix slurm output
* fix
* fix
* Add H200 dynamo-trt disaggregated configs for 1k1k and 8k1k
Expand dsr1-fp8-h200-dynamo-trt section with full configuration set:
- 1k1k MTP configs (c4-c512) with CONFIG_FILE references
- 1k1k STP configs (c4-c512) with CONFIG_FILE references
- 8k1k MTP configs (c4-c512) with CONFIG_FILE references
- 8k1k STP configs (c4-c512) with CONFIG_FILE references
All configs reference recipe YAMLs in srt-slurm-trtllm repo under
recipies/trtllm/h200/{1k1k,8k1k}/{mtp,stp}/
* add uv and sqsh file
* Update H200 dynamo-trt container image to nvcr.io#nvidia/ai-dynamo/tensorrtllm-runtime:0.8.0
* Fix H200 CONFIG_FILE references to match corrected recipe filenames
* Fix H200 dp-attn values to match recipe filenames
dep8 = enable_attention_dp: true = dp-attn: true
tep8 = enable_attention_dp: false = dp-attn: false
* Fix H200 model path to use DeepSeek-R1-0528
Update MODEL_PATH from /models/dsr1-fp8 (old DeepSeek-R1) to
/models/DeepSeek-R1-0528 (new version matching nvidia-master.yaml)
* add uv install and image fix
* fix squash file path
* fix srt-slurm repo
* update h200 runner details
* Add perf-changelog entry for dsr1-fp8-h200-dynamo-trt config
* Add H200 sglang disagg configs from srtslurm (#580)
- Add dsr1-fp8-h200-dynamo-sglang config to nvidia-master.yaml
- Include 1k1k configs: aggregated, low-latency (1P9D), high-throughput TEP/DEP (1P6D)
- Include 8k1k configs: aggregated, TEP variants (1P7D, 1P6D, 1P3D, 2P3D), DEP (1P1D)
- Add perf-changelog entry for new configuration
- Document recipe registration process in AGENT.md
* Revert "Add H200 sglang disagg configs from srtslurm (#580)" (#581)
This reverts commit f6609d9.
* recipies -> recipes
* Update dsr1-fp8-h200-dynamo-trt image to 0.8.1.post1
* Add dynamic container mapping for srtslurm.yaml
- Update SQUASH_FILE to use /data/containers/ with + separators
- Strip nvcr.io/ prefix from path to match actual .sqsh filenames
- Add CONTAINER_KEY to convert IMAGE to srt-slurm format (nvcr.io#)
- Map container key to .sqsh path dynamically in srtslurm.yaml
* Pin srt-slurm to sa-submission-q1-2026 branch
Use the release branch for Q1 2026 submission instead of main.
* Add srt-slurm GitHub URLs above h200 CONFIG_FILE entries
Link each CONFIG_FILE to its source in srt-slurm sa-submission-q1-2026 branch.
* Update perf-changelog.yaml to modify DSR1 configurations
Removed outdated DSR1 FP8 H200 Dynamo TRT configuration details and re-added them in a new section.
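The container-mapping commits above describe deriving two values from one image reference: a srt-slurm container key (registry host joined to the path with `#`) and a squash-file path under `/data/containers/` with `+` separators. A minimal bash sketch of that mapping; the variable names and exact substitution rules are assumptions inferred from the commit messages, not the actual srtslurm.yaml logic:

```shell
# Hypothetical sketch of the mapping described in the commits above.
IMAGE="nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1.post1"

# CONTAINER_KEY: srt-slurm format joins the registry host to the path with '#'
CONTAINER_KEY="${IMAGE/nvcr.io\//nvcr.io#}"

# SQUASH_FILE: strip the nvcr.io/ prefix, then replace '/' and ':' with '+'
# so the name matches the .sqsh files under /data/containers/
name="${IMAGE#nvcr.io/}"
name="${name//\//+}"
name="${name/:/+}"
SQUASH_FILE="/data/containers/${name}.sqsh"

echo "$CONTAINER_KEY"
echo "$SQUASH_FILE"
```

The `+`-separator convention is consistent with how enroot flattens image references into filenames, which would explain why stripping the registry prefix makes the path match the existing .sqsh files.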
---------
Co-authored-by: Sahithi Chigurupati <schigurupati@nvidia.com>
Co-authored-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
Co-authored-by: Cameron Quilici <cjquilici@gmail.com>
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
Co-authored-by: ishandhanani <82981111+ishandhanani@users.noreply.github.com>
Add H200 SGLang disaggregated multinode configurations, sourced from srtslurm recipes.
Depends on #570
Changes