Adding [T5/MT5/UMT5]EncoderForSequenceClassification by cbhyphen · Pull Request #40898 · huggingface/transformers

cbhyphen · 2025-09-15T22:21:09Z

What does this PR do?

This PR adds an encoder-only sequence classifier for T5, inspired by the paper "Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models". The mean of the final hidden states is used as the sentence representation, which gave the best results in the paper. For t5-small, the encoder-only classifier is nearly half the size of the encoder-decoder classifier and takes nearly a third of the time for a forward pass.
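As a rough illustration of the pooling described above, here is a minimal sketch of masked mean pooling over final hidden states followed by a linear classification layer. The names `masked_mean_pool` and `MeanPoolClassificationHead` are hypothetical and are not taken from this PR's diff; ignoring padded positions and applying dropout before the projection are assumptions, not confirmed details of the implementation.

```python
import torch
from torch import nn


def masked_mean_pool(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors over the sequence axis, skipping padded positions.

    hidden_states: (batch, seq_len, d_model) final encoder hidden states
    attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)  # (batch, seq_len, 1)
    summed = (hidden_states * mask).sum(dim=1)                   # (batch, d_model)
    counts = mask.sum(dim=1).clamp(min=1.0)                      # guard against all-padding rows
    return summed / counts


class MeanPoolClassificationHead(nn.Module):
    """Hypothetical head: mean-pooled sentence vector -> dropout -> linear logits."""

    def __init__(self, d_model: int, num_labels: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(d_model, num_labels)

    def forward(self, hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        pooled = masked_mean_pool(hidden_states, attention_mask)
        return self.out_proj(self.dropout(pooled))


if __name__ == "__main__":
    torch.manual_seed(0)
    hidden = torch.randn(2, 4, 8)                      # stand-in for encoder output
    mask = torch.tensor([[1, 1, 1, 1], [1, 1, 0, 0]])  # second row has two padding tokens
    head = MeanPoolClassificationHead(d_model=8, num_labels=3).eval()
    print(head(hidden, mask).shape)
```

In practice the `hidden_states` argument would come from an encoder such as `T5EncoderModel`; the random tensor here only stands in for that output.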

Note that I tried to include this new class in `MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES` in `modeling_auto.py`, but I could not get around one failing test, `test_load_with_mismatched_shapes` in `test_modeling_common.py`. That test seems to invoke the model as a decoder and fails here in the `T5Stack` class with the following error: `ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds`. Because of this, I did not add the class to `modeling_auto.py`, but if there is a need to do so, please let me know (any advice on how to do it would be appreciated). Having noted that, this PR does include a small test in `test_modeling_t5.py`.
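To make the difference between the two call paths concrete, the sketch below builds a tiny, randomly initialized T5 from a config (no checkpoint download) and calls it with encoder inputs only. The config sizes are arbitrary assumptions chosen to keep the example fast, so the parameter counts it produces do not reproduce the t5-small figures quoted above. It uses the existing `T5EncoderModel` and `T5Model` classes rather than the classifier added in this PR.

```python
import torch
from transformers import T5Config, T5EncoderModel, T5Model

# Tiny illustrative config; these sizes are assumptions, not t5-small's.
config = T5Config(vocab_size=128, d_model=32, d_kv=8, d_ff=64, num_layers=2, num_heads=4)

encoder_only = T5EncoderModel(config).eval()
encoder_decoder = T5Model(config).eval()

input_ids = torch.randint(0, config.vocab_size, (2, 5))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    # The encoder-only model needs no decoder inputs.
    enc_out = encoder_only(input_ids=input_ids, attention_mask=attention_mask)
    hidden_shape = tuple(enc_out.last_hidden_state.shape)

    # The full encoder-decoder model is expected to reject the same call,
    # because its decoder stack receives neither ids nor embeddings.
    try:
        encoder_decoder(input_ids=input_ids, attention_mask=attention_mask)
        decoder_error = None
    except ValueError as err:
        decoder_error = str(err)


def count_params(model: torch.nn.Module) -> int:
    """Total number of parameters (shared weights are counted once)."""
    return sum(p.numel() for p in model.parameters())


print("encoder-only hidden states:", hidden_shape)
print("encoder-decoder error:", decoder_error)
print("params:", count_params(encoder_only), "vs", count_params(encoder_decoder))
```

A test that feeds only `input_ids` to a model routed through the decoder path would hit the same kind of error, which is consistent with the failure described above.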

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@ArthurZucker

ArthurZucker

hey! Thanks for the PR. My main question is whether or not there is an actual checkpoint released with this? If not, I don't think it really makes sense, but we can do a feature request and leave it up to the community!

You should be able to leverage `class LlamaForSequenceClassification(GenericForSequenceClassification, LlamaPreTrainedModel):` as well

cbhyphen · 2025-09-17T06:12:43Z

hey @ArthurZucker thanks for reviewing! There is a fine-tuned TF model (sentence-t5) from the paper here. It's a bit different though, and outputs a 768-dimensional sentence embedding vector.

The paper was mostly inspiration for a decent sentence representation, and my main motivation for the PR was to provide a leaner classifier for T5 that can still use FLAN weights. I often fine-tune pre-trained models for text-classification where efficiency is important. This seems to fit the bill for that and I thought others might find it useful. If you prefer the feature request route just let me know how to proceed! thanks.

cbhyphen · 2025-10-09T02:01:20Z

Just submitted a feature request here. Looks like I may also need to add this for mt5 and umt5!

github-actions · 2025-10-10T02:38:35Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: mt5, t5, umt5

cbhyphen · 2025-10-15T04:25:17Z

hey @ArthurZucker there is a feature request out with community support and CI tests are green. please have a look when you have a chance, thanks!

cbhyphen force-pushed the t5-enc-for-seq-clf branch from 4f878f2 to 62bb9c8 Compare September 15, 2025 22:27

ArthurZucker reviewed Sep 16, 2025

View reviewed changes

cbhyphen force-pushed the t5-enc-for-seq-clf branch from 62bb9c8 to 2a23c91 Compare September 17, 2025 05:09

cbhyphen added 5 commits October 9, 2025 14:46

Adding T5EncoderForSequenceClassification

dbf66ff

Add Multilabel

7889ad9

Fix

bef443d

Remove From Modeling Auto

89e595f

Fix ruff

eb9899f

cbhyphen force-pushed the t5-enc-for-seq-clf branch from 2a23c91 to eb9899f Compare October 9, 2025 22:04

cbhyphen added 5 commits October 9, 2025 18:54

updates head mask

f84926b

add to mt5 umt5

87d81d8

add mt5 umt5 tests

31f3399

add mt5 umt5 model doc

5abaa22

style fixes

1874691

fix

75d92a3

cbhyphen changed the title ~~Adding T5EncoderForSequenceClassification~~ Adding [T5/MT5/UMT5]EncoderForSequenceClassification Oct 10, 2025

cbhyphen mentioned this pull request Oct 10, 2025

Support encoder text classification for sequence to sequence models like BART and T5 #41462

Open

evalstate mentioned this pull request Apr 29, 2026

Cumulative feature and defect updates from recent Transformers PRs evalstate/transformers#42

Open

Adding [T5/MT5/UMT5]EncoderForSequenceClassification #40898
cbhyphen wants to merge 11 commits into huggingface:main from cbhyphen:t5-enc-for-seq-clf

2 participants