Enable granite 4 hybrid integration tests by alex-jw-brooks · Pull Request #39222 · huggingface/transformers

alex-jw-brooks · 2025-07-04T12:18:30Z

Enables granite moe hybrid integration tests using the tiny preview model. Tolerance is adjusted to be more lenient for bfloat16.

Fixes # #38542

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ydshieh can you please take a look?

github-actions · 2025-07-04T12:19:33Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: granitemoehybrid

ydshieh · 2025-07-04T12:46:13Z

Thank you a lot. I will check the tests in our runner.

ydshieh · 2025-07-04T13:06:00Z

@alex-jw-brooks

Do you know why test_model_logits have different out.logits.float().mean(-1) in different runs?

It's just forward and should give deterministic outputs no?

alex-jw-brooks · 2025-07-07T17:44:08Z

@ydshieh I agree, but this doesn't seem to be the case, and it seems to diverge a lot more when it's run with bf16. Please let me investigate a bit - I'll follow up once I at least have a better understanding of the root cause of the non-determinism 🙂

ydshieh · 2025-07-07T17:53:35Z

It's known bf16 have less precision (to have wilder range). I'm happy to change it to fp32 (if it fits into A10).
The worrying part is more about the non-deterministic outputs in different runs.

That would be nice if we can figure out why?

(Maybe first check if all model's weights are loaded from the model checkpoint - if not we will see some warning from the log. But I think it is correctly loaded otherwise the output would be even non-sense).

Thank you for helping 🙏

alex-jw-brooks · 2025-07-07T18:23:07Z

Unfortunately, it won't without CPU offload, it's about 30 gb of vram to run it in full precision 😞 This model is pretty much combining elements from bamba and granite moe shared - I'm guessing it's something to do with bamba and hopefully expected since the bamba integration tests use similarly high tolerance for these checks, but just need to dig a bit to understand since I'm less familiar with SSMs in general 😅

I'll follow up as soon as possible, and also see if anything can be done about getting a smaller model for testing!

Enable granite 4 hybrid tests

25958c4

evalstate mentioned this pull request Apr 29, 2026

Cumulative feature and defect updates from recent Transformers PRs evalstate/transformers#42

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Enable granite 4 hybrid integration tests#39222

Enable granite 4 hybrid integration tests#39222
alex-jw-brooks wants to merge 1 commit into
huggingface:mainfrom
alex-jw-brooks:enable_granite4_tests

alex-jw-brooks commented Jul 4, 2025

Uh oh!

github-actions Bot commented Jul 4, 2025

Uh oh!

ydshieh commented Jul 4, 2025

Uh oh!

ydshieh commented Jul 4, 2025

Uh oh!

alex-jw-brooks commented Jul 7, 2025 •

edited

Loading

Uh oh!

ydshieh commented Jul 7, 2025

Uh oh!

alex-jw-brooks commented Jul 7, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

alex-jw-brooks commented Jul 4, 2025

Before submitting

Who can review?

Uh oh!

github-actions Bot commented Jul 4, 2025

Uh oh!

ydshieh commented Jul 4, 2025

Uh oh!

ydshieh commented Jul 4, 2025

Uh oh!

alex-jw-brooks commented Jul 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ydshieh commented Jul 7, 2025

Uh oh!

alex-jw-brooks commented Jul 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

alex-jw-brooks commented Jul 7, 2025 •

edited

Loading

alex-jw-brooks commented Jul 7, 2025 •

edited

Loading