You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Check for any issues from previous review comments
Provide comprehensive review feedback
Summary
This PR updates GB200 and GB300 SGLANG FP8 DSR1 Disaggregated inference configs for STP mode, including new image version, updated parallelism settings, and CONFIG_FILE-based configuration.
🔴 Critical Issues
1. Missing elif in launch_gb200-nv.sh (Syntax Error)
File:runners/launch_gb200-nv.sh:12-20
The shell script has a missing elif keyword, which will cause a syntax error at runtime:
The first if block is never closed with fi. This will cause the script to fail.
Current code:
if [[ $MODEL_PREFIX=="dsr1"&&$PRECISION=="fp4" ]];then
...
if [[ $MODEL_PREFIX=="dsr1"&&$PRECISION=="fp8" ]];then# This needs to be elif
...
else
...
fi# Only one fi, but two 'if' statements
Fix: Use elif instead of a nested if (as mentioned above), which will properly close the structure with a single fi.
⚠️ Warnings
1. Inconsistent SRT_SLURM_MODEL_PREFIX for dynamo-trt
File:runners/launch_gb200-nv.sh:21-34
The dynamo-trt block still uses MODEL_PREFIX == "dsr1-fp8" (line 28), which is inconsistent with the fix applied to dynamo-sglang. The review feedback from PR #627 suggested using MODEL_PREFIX == "dsr1" && PRECISION == "fp8" pattern to avoid breaking the frontend.
elif [[ $MODEL_PREFIX=="dsr1-fp8" ]];then# ⚠️ Still uses dsr1-fp8export MODEL_PATH="/mnt/numa1/groups/sa-shared/models/deepseek-r1-0528/"export SERVED_MODEL_NAME="deepseek-r1-fp8"
Recommendation: Apply the same pattern to dynamo-trt for consistency:
✅ Proper parallelism settings (TP/EP values match the comments)
✅ dp-attn settings are consistent with low-latency vs high-throughput scenarios
2. perf-changelog.yaml Updates
✅ Properly documents the config changes
✅ Lists both affected config keys
✅ Includes image version and description
3. Model prefix fix for dsr1-fp8-gb200-dynamo-trt
✅ Line 2289: model-prefix: dsr1 (changed from dsr1-fp8) - addresses the frontend compatibility issue
Minor Observations
YAML indentation: Some comments in the search-space have inconsistent indentation (extra leading space before #). This doesn't affect functionality but could be cleaned up for consistency.
Removed code: The old dynamo-sglang-specific job handling code was removed from launch_gb200-nv.sh (lines 55-97 in the old file). This seems intentional as srt-slurm now handles job submission.
Verdict
🔴 Changes Requested - The shell script has a critical syntax error that will prevent execution. Please fix the missing elif before merging.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.