Enable granite 4 hybrid integration tests #39222
|
[For maintainers] Suggested jobs to run (before merge) run-slow: granitemoehybrid |
|
Thank you a lot. I will check the tests in our runner. |
|
Do you know why? It's just a forward pass and should give deterministic outputs, no? |
|
@ydshieh I agree, but this doesn't seem to be the case, and it seems to diverge a lot more when it's run with bf16. Please let me investigate a bit - I'll follow up once I at least have a better understanding of the root cause of the non-determinism 🙂 |
|
It's known that bf16 has less precision (in exchange for a wider range). I'm happy to change it to fp32 (if it fits on an A10). It would be nice if we could figure out why, though. (Maybe first check whether all of the model's weights are loaded from the checkpoint; if not, we will see a warning in the log. But I think they are loaded correctly, otherwise the output would be complete nonsense.) Thank you for helping 🙏 |
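
To make the precision point concrete, here is a minimal sketch that emulates bfloat16 rounding in pure Python. It assumes bfloat16 is the upper 16 bits of an IEEE float32 with round-to-nearest-even; the helper name `to_bf16` is hypothetical and NaN/overflow handling is left out.

```python
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (hypothetical helper).

    bfloat16 keeps float32's sign and 8 exponent bits but only
    7 explicit mantissa bits, so the low 16 bits are rounded away.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round-to-nearest-even: bias depends on the lowest bit that is kept.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Near 1.0 the spacing between bfloat16 values is 2**-7 = 0.0078125,
# so small differences collapse onto the same representable value.
small_step = to_bf16(1.001)
larger_step = to_bf16(1.01)
# The exponent range matches float32, so very large values stay finite.
huge = to_bf16(3e38)
print(small_step, larger_step, huge)
```

Rounding errors of this size can compound across layers, which is one plausible reason (not a confirmed diagnosis for this model) for bf16 outputs to drift further from reference values than fp32 outputs.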
|
Unfortunately, it won't fit without CPU offload; it takes about 30 GB of VRAM to run in full precision 😞 This model largely combines elements from Bamba and Granite MoE Shared. I'm guessing the divergence has something to do with Bamba, and hopefully it's expected, since the Bamba integration tests use similarly high tolerances for these checks. I just need to dig a bit to understand it, since I'm less familiar with SSMs in general 😅 I'll follow up as soon as possible, and also see if anything can be done about getting a smaller model for testing! |
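
As a rough back-of-the-envelope check on the memory figure, the sketch below estimates weight memory from a parameter count. The count used here (`ASSUMED_PARAMS`) is an illustrative assumption, not a number taken from this thread or the model card, and the estimate ignores activations and any cache.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Hypothetical parameter count, chosen only for illustration.
ASSUMED_PARAMS = 7_000_000_000
A10_MEMORY_GB = 24  # NVIDIA A10 memory size

fp32_gb = weight_memory_gb(ASSUMED_PARAMS, 4)  # 4 bytes per fp32 weight
bf16_gb = weight_memory_gb(ASSUMED_PARAMS, 2)  # 2 bytes per bf16 weight

print(f"fp32: {fp32_gb:.1f} GB, fits on A10: {fp32_gb < A10_MEMORY_GB}")
print(f"bf16: {bf16_gb:.1f} GB, fits on A10: {bf16_gb < A10_MEMORY_GB}")
```

Under that assumption, the fp32 weights alone exceed a 24 GB card while the bf16 weights do not, which is consistent with the tests being run in bf16.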
Enables granite moe hybrid integration tests using the tiny preview model. Tolerance is adjusted to be more lenient for bfloat16.
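
For readers unfamiliar with tolerance-based checks, here is a minimal sketch of what "more lenient" means. It assumes the usual closeness criterion `|actual - expected| <= atol + rtol * |expected|` (the one used by `torch.testing.assert_close`); the helper `within_tolerance`, the values, and the tolerances are hypothetical and are not the ones used in this PR.

```python
def within_tolerance(actual, expected, rtol, atol):
    """Return True if every element satisfies |a - e| <= atol + rtol * |e|."""
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


# Hypothetical logits: a reference run and a slightly drifted bf16 run.
expected_logits = [-2.50, 0.75, 4.00]
bf16_logits = [-2.52, 0.76, 3.97]

strict_ok = within_tolerance(bf16_logits, expected_logits, rtol=1e-5, atol=1e-5)
lenient_ok = within_tolerance(bf16_logits, expected_logits, rtol=1e-2, atol=1e-2)
print(strict_ok, lenient_ok)
```

The strict check rejects the drifted values while the lenient one accepts them; the trade-off is that a looser tolerance can also hide small genuine regressions.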
Fixes #38542
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@ydshieh can you please take a look?