[NV] dsr1 fp4 b300 dynamo trtllm#585

Merged
cquil11 merged 25 commits into main from nv/dsr1-fp4-b300-dynamo-trtllm-260122 on Jan 28, 2026

cquil11 commented Jan 27, 2026

csahithi and others added 19 commits January 22, 2026 14:11
Signed-off-by: jthomson04 <jothomson@nvidia.com>

cquil11 commented Jan 27, 2026

@jthomson04 @csahithi can one of y'all please re-approve this? We had to revert the original merge and re-run because the directory names on the clusters changed, so the runs failed.

Edit: there also seem to be some unrelated errors involving image download, possibly flakes: https://github.com/InferenceMAX/InferenceMAX/actions/runs/21412751024/job/61653699072

cc @kedarpotdar-nv


git clone https://github.com/ishandhanani/srt-slurm.git "$SRT_REPO_DIR"
cd "$SRT_REPO_DIR"
git checkout b4abe4643a7009f3539b36bdc508408874a4c930
and this appears to be causing errors indirectly, possibly because new commits to the main branch introduced breaking changes?
https://github.com/InferenceMAX/InferenceMAX/actions/runs/21412751024/job/61653699014#step:4:893

@jthomson04 (Collaborator)

Ah, we recently did a big rebase and updated our configs. I'll push a fix shortly.

Signed-off-by: jthomson04 <jothomson@nvidia.com>
jthomson04 force-pushed the nv/dsr1-fp4-b300-dynamo-trtllm-260122 branch from 6b7f7d9 to d4bc5f0 on January 27, 2026 22:30
Signed-off-by: jthomson04 <jothomson@nvidia.com>
cquil11 merged commit 4248627 into main on Jan 28, 2026
2 checks passed
cquil11 deleted the nv/dsr1-fp4-b300-dynamo-trtllm-260122 branch on January 28, 2026 16:27
nlevin-ui added a commit that referenced this pull request Jan 28, 2026
Keep all entries:
- H200 dynamo-trt entry (this PR)
- Evals-only entry (PR #558)
- B300 dynamo-trt entry (PR #585)

3 participants