Conversation
Signed-off-by: jthomson04 <jothomson@nvidia.com>
@claude please fix this, seems like a typo in perf-changelog.yaml
Claude finished @functionstackx's task in 52s: Fixed YAML syntax error in perf-changelog.yaml ✓
Issue: missing space after the dash at line 329 (`-config-keys:`):

```diff
--config-keys:
-  - dsr1-fp8-gb200-dynamo-trt
+- config-keys:
+  - dsr1-fp8-gb200-dynamo-trt
```

Fix pushed: 9cb5200
Add missing space after dash in config-keys entry for dsr1-fp8-gb200-dynamo-trt: '-config-keys:' should be '- config-keys:'.

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
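The fix above works because YAML parses `-config-keys:` as a scalar mapping key rather than a list item; the space after the dash is what makes it a sequence entry. A minimal lint sketch for catching this class of typo (`find_bad_dashes` is an illustrative helper, not part of the InferenceMAX tooling):

```python
import re

def find_bad_dashes(text):
    """Return (line_number, line) pairs where a sequence dash lacks its space.

    A dash immediately followed by a letter (e.g. "-config-keys:") is parsed
    by YAML as part of a scalar, not as a list item, and is almost always a typo.
    """
    bad = []
    for lineno, line in enumerate(text.splitlines(), 1):
        if re.match(r"-[A-Za-z]", line.lstrip()):
            bad.append((lineno, line))
    return bad

snippet = "scenarios:\n  -config-keys:\n  - config-keys:\n"
print(find_bad_dashes(snippet))  # [(2, '  -config-keys:')]
```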
@claude please fix this too in nvidia-master.yaml, seems like just a simple typo in the model prefix. change to just dsr1 instead of dsr1-fp8
Claude finished @functionstackx's task in 47s: Fixed model prefix typo in nvidia-master.yaml ✓
Issue: invalid model-prefix at line 2297:

```diff
- model-prefix: dsr1-fp8
+ model-prefix: dsr1
```

The runner script only supports the 'gptoss' and 'dsr1' prefixes.

Fix pushed: e5efea3
The model-prefix 'dsr1-fp8' is not a supported prefix. The supported prefixes are 'gptoss' and 'dsr1'. Changed to 'dsr1' to fix the launch_gb200-nv.sh runner error.

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
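The check the runner performs can be sketched as follows. This is a hedged illustration based only on the commit message above, which says launch_gb200-nv.sh supports the 'gptoss' and 'dsr1' prefixes; SUPPORTED_PREFIXES and validate_prefix are illustrative names, not the script's actual implementation:

```python
# Prefixes the commit message says the runner script recognizes (assumption:
# the real check lives in launch_gb200-nv.sh, not in Python).
SUPPORTED_PREFIXES = {"gptoss", "dsr1"}

def validate_prefix(model_prefix):
    """Reject a model-prefix value the runner would not recognize."""
    if model_prefix not in SUPPORTED_PREFIXES:
        raise ValueError(
            f"unsupported model-prefix {model_prefix!r}; "
            f"expected one of {sorted(SUPPORTED_PREFIXES)}"
        )
    return model_prefix

print(validate_prefix("dsr1"))  # dsr1
# validate_prefix("dsr1-fp8") raises ValueError, matching the runner error
```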
@jthomson04 can u take a look at this error related to the path of the model? https://github.com/InferenceMAX/InferenceMAX/actions/runs/21609573902/job/62274835308?pr=617
Claude broke it. I have a fix locally, will push soon.
Signed-off-by: jthomson04 <jothomson@nvidia.com>
mb lol
@jthomson04 claude not AGI confirmed
@claude please make the perf changelog description and PR description slightly more detailed. don't list every single thing per se, but give a high-level overview of the scenarios added. be more detailed in the PR description (make a table of scenarios, something like the https://github.com/InferenceMAX/InferenceMAX/pull/582 description)
@claude in .github/configs/nvidia-master.yaml, there are scenarios with corresponding links to external configs that contain more detailed information. can you please get the configs from the links and double check that the fields (such as prefill/decode ep/tp, num workers, etc.) that are also present in this repo's config match?
Claude finished @cquil11's task in 1m 29s: Updated PR and changelog descriptions ✓
Changes made in commit 05cde10.
Claude finished @cquil11's task in 2m 34s: Verified external config links ✓
Result: All 38 external configs match the local values. I fetched each external config from its linked URL and compared the shared fields.
Verified configs:
- 1k1k MTP configs (7)
- 1k1k STP configs (7)
- 1k8k MTP configs (5)
- 1k8k STP configs (5)
- 8k1k MTP configs (7)
- 8k1k STP configs (7)
Co-authored-by: Cameron Quilici <cquil11@users.noreply.github.com>
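The cross-check described above can be sketched as a field-by-field diff between a local scenario entry and its linked external config. This is a hedged illustration: the field names in FIELDS and the dict shapes are assumptions for demonstration, not the actual schema of nvidia-master.yaml or the external configs:

```python
# Fields requested in the review comment (prefill/decode ep/tp, num workers);
# the exact key names here are illustrative assumptions.
FIELDS = ("prefill_tp", "prefill_ep", "decode_tp", "decode_ep", "num_workers")

def diff_fields(local, remote, fields=FIELDS):
    """Return {field: (local_value, remote_value)} for fields that disagree."""
    return {
        f: (local.get(f), remote.get(f))
        for f in fields
        if local.get(f) != remote.get(f)
    }

local = {"prefill_tp": 4, "decode_ep": 8, "num_workers": 2}
remote = {"prefill_tp": 4, "decode_ep": 8, "num_workers": 2}
print(diff_fields(local, remote))  # {} -> the shared fields match
```

An empty diff for every scenario corresponds to the "All 38 external configs match" result reported above.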
cquil11 left a comment:
lgtm once https://github.com/InferenceMAX/InferenceMAX/actions/runs/21610819786 passes
ty!
Signed-off-by: jthomson04 <jothomson@nvidia.com>
looking good to merge
Summary
Add DeepSeek R1 FP8 GB200 Dynamo TRT-LLM disaggregated multinode configurations using the srt-slurm recipe-based workflow.
Changes
- Add the dsr1-fp8-gb200-dynamo-trt config to nvidia-master.yaml
- Use the dsr1 model prefix in launch_gb200-nv.sh

Config Details
- Image: nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:0.8.1.post2
- Runner: GB200 NVL72 (multinode, disaggregated)
- Model: DeepSeek-R1-0528 (FP8)
- Total: 38 scenarios
Architecture Variants