HunYuan opensource by yjc9696 · Pull Request #39606 · huggingface/transformers

yjc9696 · 2025-07-23T13:18:37Z

What does this PR do?

This PR primarily aims to add support for the Hunyuan series of models in inference. We noticed that previous Hunyuan models relied on trust_remote_code for inference, which makes version maintenance difficult and often leads to outdated inference code. To address this, we are integrating the inference code into the Transformers library to support continuous updates for future open-source releases.

The submitted code includes the inference implementations for both hunyuan_v1_dense and hunyuan_v1_moe, along with their corresponding configurations and tokenizers.

For unit testing, we added a single-sample test for the hunyuan_v1_moe model using tencent/Hunyuan-A13B-Instruct. Unfortunately, the hunyuan_v1_dense model is not yet officially open-sourced, so we currently lack a testable model for it; we will update the tests once the model is released.
This is my first PR submission. After carefully studying the Contribute to 🤗 Transformers guide, I've modified my code to pass all make fixup checks.

I'd greatly appreciate any feedback if additional changes or improvements are needed - please don't hesitate to point them out!

Before submitting

[Y] Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
[Y] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

yjc9696 · 2025-07-24T07:09:09Z

how can I solve this problem?

Rocketknight1 · 2025-07-24T12:51:36Z

cc @ArthurZucker for new text models, but let me know if you want me or someone else to take the initial review!

yjc9696 · 2025-07-24T16:28:03Z

> cc @ArthurZucker for new text models, but let me know if you want me or someone else to take the initial review!

Thank you for your attention and response. As this is my first submission, I'm not entirely certain which experts I should approach for code review. Would you be able to offer some suggestions or guidance on this matter?

ArthurZucker

Hey! The main point of guidance is to properly isolate the architectural differences between your model and the other models supported in the library! That way you can use inheritance to write the modeling code with modular transformers: https://huggingface.co/docs/transformers/en/modular_transformers 🤗

ArthurZucker · 2025-07-28T11:37:56Z

Otherwise very cool to see this contribution! 🤗

ArthurZucker

Last nits on the MoE: it's pretty standard now, so it is important to isolate what you are adding / what is new! From what I can see, it might be the token capacity?
If not, then let's just use what we already have standardized, which we know passes compile / potentially TP etc.!

ArthurZucker · 2025-08-19T07:23:47Z

+
+
+def topkgating(logits: Tensor, topk: int):
+    if topk == 1:


isn't top_k always 1?

ArthurZucker · 2025-08-19T07:24:46Z

+        """Implements Top1Gating on logits."""
+        # everything is in fp32 in this function
+        logits = logits.float()
+        gates = F.softmax(logits, dim=1)


any reason not to use gpt_oss or mixtral gating? should be absolutely equivalent + its now fairly standard!

We also have other implementations with token capacity as well!

Hello! We've adopted a more community-standard MoE implementation by referencing other open-source models. This solution delivers identical performance to the original one, and we hope it can effectively resolve the current issue.
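For readers unfamiliar with the routing style this review thread refers to, here is a minimal sketch of softmax top-k gating with renormalized weights. It is written in plain Python so that it runs without PyTorch. The helper name `topk_route` and the choice to always renormalize the selected weights are assumptions made for illustration; this is not the code merged in this PR.

```python
import math


def topk_route(logits, top_k):
    """Route one token: softmax over expert logits, keep the top_k experts,
    and renormalize the kept probabilities so they sum to 1.

    Hypothetical helper for illustration; not part of transformers.
    """
    # Numerically stable softmax over the expert dimension.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Indices of the top_k most probable experts, best first.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Renormalize over the selected experts only.
    kept = sum(probs[i] for i in chosen)
    return [(i, probs[i] / kept) for i in chosen]


# One token, four experts, two experts selected.
routes = topk_route([2.0, 0.5, 1.0, -1.0], top_k=2)
for expert_index, weight in routes:
    print(expert_index, round(weight, 4))
```

With `top_k=1` the single selected expert always receives weight 1 after renormalization, so the renormalization step only matters when more than one expert is kept.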

ArthurZucker · 2025-08-19T07:27:20Z

+        chunks = dispatched_input.chunk(self.num_experts, dim=0)
+        expert_outputs = []
+        for chunk, expert in zip(chunks, self.experts):
+            expert_outputs.append(expert(chunk))
+
+        expert_output = torch.cat(expert_outputs, dim=0)
+        # combined_output = torch.einsum("sec,ecm->sm", combine_weights.type_as(hidden_states), expert_output)
+        combine_exp = combine_weights.type_as(hidden_states).unsqueeze(3)  # (s, e, c, 1)
+        expert_exp = expert_output.unsqueeze(0)  # (1, e, c, m)
+        combined_output = (combine_exp * expert_exp).sum(dim=(1, 2))  # (s, m)


Well, this is rigorously equivalent to the approach we have in llama4, with scattering that uses the dispatch index. I don't think there is a difference with llama4 here, no? Let's standardize, please.

Hello! We've adopted a more community-standard MoE implementation by referencing other open-source models. This solution delivers identical performance to the original one, and we hope it can effectively resolve the current issue.
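The equivalence claimed in this thread can be checked on a toy example. The sketch below compares the dense contraction from the reviewed diff (the `"sec,ecm->sm"` einsum over sparse combine weights) with an index-based accumulation that only visits the assigned expert slots. It uses plain Python lists; the function names and the toy data are hypothetical, and the index-based version is a generic illustration rather than the llama4 implementation.

```python
def dense_combine(combine_weights, expert_output):
    """out[s][m] = sum over (e, c) of combine_weights[s][e][c] * expert_output[e][c][m],
    the same contraction as einsum("sec,ecm->sm")."""
    hidden = len(expert_output[0][0])
    out = []
    for per_token in combine_weights:
        row = [0.0] * hidden
        for e, per_expert in enumerate(per_token):
            for c, w in enumerate(per_expert):
                for m in range(hidden):
                    row[m] += w * expert_output[e][c][m]
        out.append(row)
    return out


def scatter_combine(assignments, expert_output, num_tokens):
    """Accumulate only the assigned (token, expert, slot, weight) entries."""
    hidden = len(expert_output[0][0])
    out = [[0.0] * hidden for _ in range(num_tokens)]
    for s, e, c, w in assignments:
        for m in range(hidden):
            out[s][m] += w * expert_output[e][c][m]
    return out


# Toy setup: 3 tokens, 2 experts, capacity of 2 slots per expert, hidden size 2.
num_tokens, num_experts, capacity = 3, 2, 2
expert_output = [
    [[1.0, 2.0], [3.0, 4.0]],  # expert 0, slots 0 and 1
    [[5.0, 6.0], [7.0, 8.0]],  # expert 1, slots 0 and 1
]
# (token, expert, slot, weight): each token is routed to one expert slot.
assignments = [(0, 0, 0, 0.5), (1, 1, 0, 0.25), (2, 0, 1, 1.0)]

# Dense combine weights are zero everywhere except at the assigned slots.
combine_weights = [
    [[0.0] * capacity for _ in range(num_experts)] for _ in range(num_tokens)
]
for s, e, c, w in assignments:
    combine_weights[s][e][c] = w

dense = dense_combine(combine_weights, expert_output)
sparse = scatter_combine(assignments, expert_output, num_tokens)
print(dense == sparse)
```

The dense form does work proportional to tokens × experts × capacity even though almost all weights are zero, while the index-based form touches only the assigned slots, which is one practical reason to prefer a shared, standardized implementation.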

ArthurZucker · 2025-08-19T07:28:40Z

run-slow: auto, hunyuan_v1_dense, hunyuan_v1_moe

github-actions · 2025-08-19T07:30:15Z

This comment contains run-slow, running the specified jobs:

models: ['models/auto', 'models/hunyuan_v1_dense', 'models/hunyuan_v1_moe']
quantizations: [] ...

yjc9696 · 2025-08-19T09:03:36Z

run-slow: auto, hunyuan_v1_dense, hunyuan_v1_moe

ArthurZucker · 2025-08-20T06:06:38Z

Do you know why you had to do this? Happy to help fix it.

Since the tokenizer part is not intended to be included in the open-source code, the approach of using trust_remote_code is adopted.

I can help you convert it no?

https://github.com/huggingface/transformers/blob/7dbc054e2a0c3cafd3ea22db0566db700b3a8cbf/src/transformers/integrations/tiktoken.py#L1-L44

Using this will allow you to avoid relying on remote code!

This is a legacy issue. The script you provided has already been used for our dense series models, but the MoE model was open-sourced before that. As a result, the tiktoken tokenizer file sits in the model file directory, so users can rely on the trust_remote_code approach for compatibility.

Sorry I am not sure I understand, but transformers users (as well as vllm) expect AutoTokenizer to work without having to add trust_remote_code=True when the model is merged with transformers

OK, I understand what you mean. We will remove this test case until our model is ready for the fast tokenizer.

We have modified the tokenizer in the model card by removing the trust_remote_code dependency and re-added the corresponding tests.
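As a rough illustration of what this thread is aiming for, the sketch below loads the tokenizer and model through the Auto classes without passing `trust_remote_code=True`. It assumes a transformers release that includes this PR, the `accelerate` package for `device_map="auto"`, and enough memory for the `tencent/Hunyuan-A13B-Instruct` checkpoint. The helper names `load_hunyuan_moe` and `generate_reply` are hypothetical.

```python
def load_hunyuan_moe(model_id="tencent/Hunyuan-A13B-Instruct"):
    """Load tokenizer and model via the Auto classes, without trust_remote_code.

    Hypothetical helper; downloads the checkpoint on first call.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model


def generate_reply(tokenizer, model, prompt, max_new_tokens=32):
    """Tokenize a prompt, generate a continuation, and decode it to text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_hunyuan_moe()` followed by `generate_reply(tokenizer, model, "Hello")`; neither function is invoked here because doing so downloads the full checkpoint.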

ArthurZucker · 2025-08-21T11:59:52Z

run-slow: auto, hunyuan_v1_dense, hunyuan_v1_moe

github-actions · 2025-08-21T12:01:19Z

This comment contains run-slow, running the specified jobs:

models: ['models/auto', 'models/hunyuan_v1_dense', 'models/hunyuan_v1_moe']
quantizations: [] ...

ArthurZucker

Kudos! My last comment is on norm_topk_prob: if it is always true, let's hardcode it to remove code paths!

ArthurZucker · 2025-08-21T12:02:07Z

-        bsz, seq_len, hidden_size = hidden_states.shape
+        self.shared_mlp = HunYuanMoEV1MLP(config, layer_idx=layer_idx, is_shared_mlp=True)

+    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:


a lot better thanks! 🚀

yjc9696 · 2025-08-21T16:05:18Z

> Kudos! My last comment is on norm_topk_prob: if it is always true, let's hardcode it to remove code paths!

Already fixed~

github-actions · 2025-08-21T16:05:51Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, hunyuan_v1_dense, hunyuan_v1_moe

HuggingFaceDocBuilderDev · 2025-08-22T08:00:37Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

yjc9696 changed the title ~~Hunyuan opensource~~ HunYuan opensource Jul 23, 2025

pridejcyang and others added 8 commits July 24, 2025 17:17

merge opensource_hunyuan

68e75b0

add head_dim

4706f8c

fix assertion error

a27541c

fix seen_tokens

c9b7bd2

ready_for_upstream (merge request !17)

bbaf3c9

Squash merge branch 'ready_for_upstream' into 'main' * fix configuration type&docstring * fix style

ready_for_upstream (merge request !18)

488016d

Squash merge branch 'ready_for_upstream' into 'main' * add doc * fix testcode * fix configuration type&docstring

rename base model

5bbe0a7

remove assert

70711e5

yjc9696 force-pushed the hunyuan_opensource branch from 319d9a1 to 70711e5 Compare July 24, 2025 09:17

Merge branch 'main' into hunyuan_opensource

23ce627

yjc9696 and others added 3 commits July 25, 2025 12:09

Merge branch 'main' into hunyuan_opensource

8060d53

update

d6cb209

remove tiktoken

aed87b5

yjc9696 force-pushed the hunyuan_opensource branch from f244201 to aed87b5 Compare July 25, 2025 09:55

yjc9696 and others added 6 commits July 28, 2025 11:37

Merge pull request #1 from yjc9696/hunyuan_mingji_fix

4c20519

fix usable_length API

update

3baf483

Merge pull request #2 from yjc9696/hunyuan_opensource_mingji_fix_args

880f31e

update

Merge branch 'main' into hunyuan_opensource

c473ade

fix moe and code style (#3)

30a77c9

* update * fix format * update * revert makefile

fix moe config

e9450fc

ArthurZucker reviewed Jul 28, 2025

ArthurZucker added the New model label Jul 28, 2025

yjc9696 added 3 commits July 29, 2025 11:57

fix numel()

ff40997

remove prepare_inputs_for_generation

22b9c5e

fix kv_seq_len

07f228b

fix modular

62bce08

ArthurZucker approved these changes Aug 19, 2025

fix testcode

1df2109

ArthurZucker reviewed Aug 20, 2025

yjc9696 and others added 5 commits August 20, 2025 19:25

remove A13B unit test

3835b22

Fix moe v1 (#9)

6e9eaab

fix moe & gate

Fix gate norm (#10)

d4f65d4

* add norm_topk_prob

Fix testcase (#11)

33cb202

* fix&skip test

Fix testcase (#12)

df81778

* skip testcase

ArthurZucker approved these changes Aug 21, 2025

Fix norm topk (#13)

59775cd

* hardcode norm_topk_prob * fix testcase

ArthurZucker enabled auto-merge (squash) August 22, 2025 07:51

ArthurZucker merged commit cf487cd into huggingface:main Aug 22, 2025

Joshua-Chin mentioned this pull request Aug 22, 2025

Add tokenizer_kwargs argument to the text generation pipeline #40364

Merged

Kingsleyandher mentioned this pull request Nov 14, 2025

add hunyuanv1 dense and moe model linkedin/Liger-Kernel#940

Merged
