
[AMD]: fix AITER flags for vllm v0.14.0 docker image#535

Merged
functionstackx merged 5 commits into SemiAnalysisAI:main from ROCm:fix_gfx942_gptoss_envs
Jan 26, 2026

Conversation

@Rohan138
Contributor

#496 updated the MI300/MI325 docker image to the recent upstream vLLM v0.14.0 release without updating the corresponding AITER environment variables. As a result, the runs in #496 use the default vLLM Triton attention kernel instead of the recommended AITER unified attention backend for gpt-oss.

[image: performance chart] The lower curves on both MI300 and MI325 are from 01/21, as opposed to the previous performance.
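The fix boils down to setting the AITER-related environment variables before launching vLLM. A minimal sketch of that pattern is below; the specific flag names are assumptions drawn from vLLM's ROCm AITER environment variables, and the PR diff itself is the source of truth for the exact set.

```python
import os

# NOTE: flag names here are assumptions based on vLLM's ROCm AITER env
# vars; consult the PR diff / vllm-project/recipes for the authoritative set.
AITER_FLAGS = {
    "VLLM_ROCM_USE_AITER": "1",               # enable AITER kernels on ROCm
    "VLLM_USE_AITER_UNIFIED_ATTENTION": "1",  # use the AITER unified attention backend
}

def apply_aiter_flags(env=None):
    """Merge the AITER flags into an environment mapping.

    Defaults to mutating os.environ, so it can be called before the vLLM
    server is launched in the same process or before spawning a subprocess.
    """
    target = os.environ if env is None else env
    target.update(AITER_FLAGS)
    return target
```

In a Dockerfile these would instead be baked in as `ENV` lines, which is presumably what the image update in this PR does.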

@functionstackx
Contributor

functionstackx commented Jan 23, 2026

thanks @Rohan138

can you add this to https://github.com/InferenceMAX/InferenceMAX/blob/main/perf-changelog.yaml and, once your PR is ready, enable the sweep-enabled label to start the sweep?

additionally, can you document the recommended flags in https://github.com/vllm-project/recipes/tree/main?

Klaud reasonably reads the vllm-project/recipes repo & the release notes to determine what flags to change. If it isn't well documented there, then, just like the "average" human ROCm user, Klaud unfortunately won't change it.

@Rohan138 Rohan138 marked this pull request as ready for review January 23, 2026 15:54
@Rohan138 Rohan138 requested a review from a team as a code owner January 23, 2026 15:54
@Rohan138
Contributor Author

Thanks @functionstackx, can you add the sweep-enabled label/give me perms? We'll document the updates in vllm-recipes as well.

@functionstackx
Contributor

hmm @cquil11

why did the sweeper bug out?

  File "/home/runner/work/InferenceMAX/InferenceMAX/utils/process_changelog.py", line 141, in <module>
    main()
  File "/home/runner/work/InferenceMAX/InferenceMAX/utils/process_changelog.py", line 107, in main
    result = subprocess.run(
             ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['python3', 'utils/matrix_logic/generate_sweep_configs.py', 'test-config', '--config-keys', 'gptoss-fp4-mi325x-vllm', 'gptoss-fp4-mi300x-vllm', '--config-files', '.github/configs/amd-master.yaml', '.github/configs/nvidia-master.yaml', '--run-evals']' returned non-zero exit status 1.
Error: Process completed with exit code 1.
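The traceback suggests `process_changelog.py` launches the sweep-config generator via `subprocess.run(..., check=True)`, which raises `CalledProcessError` on any non-zero child exit and hides the child's own error output. A minimal sketch of that pattern (the `run_generator` helper name is hypothetical, not from the repo):

```python
import subprocess
import sys

def run_generator(cmd):
    """Run a child command, surfacing its stderr if it fails.

    subprocess.run(..., check=True) raises CalledProcessError whenever the
    child exits non-zero, which is exactly what the sweep log above shows.
    """
    try:
        return subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as e:
        # Without this, the CI log only shows the generic non-zero-exit
        # message and the child's real error stays hidden.
        print(e.stderr, file=sys.stderr)
        raise
```

Capturing and re-printing stderr before re-raising would make failures like this one (exit status 1 from `generate_sweep_configs.py`) much easier to diagnose from the Actions log.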

@functionstackx
Contributor

@Rohan138 since this is your ROCm fork, I can't rebase it myself. Can you rebase again to include this commit? InferenceMAX/InferenceMAX@38546bb

@functionstackx
Contributor

@cquil11 the GitHub Action still doesn't work on forks? Should we just manually import this to run the GitHub Action?

@cquil11
Collaborator

cquil11 commented Jan 26, 2026

/sweep test-config --config-keys gptoss-fp4-mi300x-vllm gptoss-fp4-mi325x-vllm --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml

@SemiAnalysisAI SemiAnalysisAI deleted a comment from github-actions bot Jan 26, 2026
@github-actions
Contributor

@cquil11 Kicking off a sweep.

Run: https://github.com/InferenceMAX/InferenceMAX/actions/runs/21367875017
Command: test-config --config-keys gptoss-fp4-mi300x-vllm gptoss-fp4-mi325x-vllm --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml
Pinned ref: fb14b96
Approval: not required (trusted collaborator).

@cquil11
Collaborator

cquil11 commented Jan 26, 2026

@functionstackx @Rohan138 the above command accomplishes the same sweep test

@functionstackx
Contributor

validated, sweep passes, LGTM: https://github.com/InferenceMAX/InferenceMAX/actions/runs/21367875017

@Rohan138 please update vllm-project/recipes when you get the chance! thanks!

@functionstackx functionstackx merged commit e234f64 into SemiAnalysisAI:main Jan 26, 2026
10 of 196 checks passed
cquil11 pushed a commit that referenced this pull request Jan 29, 2026
* fix AITER flags for v0.14.0 release

* drop mi325 triton gemm env var

* Add changes to perf changelog