[AMD]: fix AITER flags for vllm v0.14.0 docker image #535
functionstackx merged 5 commits into SemiAnalysisAI:main from
Conversation
thanks @Rohan138! can u add this to https://github.com/InferenceMAX/InferenceMAX/blob/main/perf-changelog.yaml, and once ur PR is ready enable the

additionally, can u document the recommended flags in https://github.com/vllm-project/recipes/tree/main
Thanks @functionstackx, can you add the
hmm @cquil11 why did the sweeper bug out?
@Rohan138 since this is ur ROCm fork, i can't rebase it myself. can u rebase again to include this commit? InferenceMAX/InferenceMAX@38546bb
@cquil11 the github action still doesn't work on forks? should we just manually import this to run the github action?
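If the action really can't run from a fork, one way to trigger a workflow manually is the GitHub CLI's `workflow run` command. A hedged sketch follows; the workflow filename (`sweep.yml`) and the input names are hypothetical placeholders, not taken from this repo:

```shell
# Hedged sketch: manually dispatch a workflow_dispatch-enabled workflow
# with the GitHub CLI. "sweep.yml" and the -f input names are assumed,
# not confirmed against the InferenceMAX repo.
gh workflow run sweep.yml \
  --repo InferenceMAX/InferenceMAX \
  --ref main \
  -f config-files=.github/configs/amd-master.yaml \
  -f runner-config=.github/configs/runners.yaml
```

This only works if the target workflow declares a `workflow_dispatch` trigger with matching inputs, and the caller has write access to the repo.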
/sweep test-config --config-keys gptoss-fp4-mi300x-vllm gptoss-fp4-mi325x-vllm --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml
@cquil11 Kicking off a sweep. Run: https://github.com/InferenceMAX/InferenceMAX/actions/runs/21367875017
@functionstackx @Rohan138 the above command accomplishes the same sweep test
validated, passing, lgtm: https://github.com/InferenceMAX/InferenceMAX/actions/runs/21367875017 @Rohan138 please update
* fix AITER flags for v0.14.0 release
* drop mi325 triton gemm env var
* Add changes to perf changelog
#496 updated the MI300/MI325 docker image to the recent upstream v0.14.0 release without updating the corresponding AITER environment variables. As a result, the runs in #496 used the default vLLM Triton attention kernel instead of the recommended AITER unified attention backend for gpt-oss.
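For illustration, the kind of AITER environment variables at issue might look like the following. This is a hedged sketch: the variable names are assumptions modeled on vLLM's ROCm env flags, and the authoritative list is this PR's diff plus `vllm/envs.py` for v0.14.0:

```shell
# Hedged sketch: enable AITER kernels for vLLM on ROCm (MI300X/MI325X).
# Variable names are assumed, not confirmed against the v0.14.0 image;
# check this PR's diff and vllm/envs.py for the actual flags and defaults.
export VLLM_ROCM_USE_AITER=1                    # master switch for AITER kernels
export VLLM_ROCM_USE_AITER_UNIFIED_ATTENTION=1  # prefer AITER unified attention over the default Triton kernel
```

Because flag names and defaults shift between vLLM releases, a docker image bump like #496 can silently fall back to the Triton path unless these exports are revalidated against the new release.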