T5Gemma by jncraton · Pull Request #1955 · OpenNMT/CTranslate2

jncraton · 2025-12-20T14:05:40Z

Support T5Gemma architecture. Here's the basic idea from the transformers T5Gemma documentation:

T5Gemma (aka encoder-decoder Gemma) was proposed in a research paper by Google. It is a family of encoder-decoder large language models, developed by adapting pretrained decoder-only models into encoder-decoder. T5Gemma includes pretrained and instruction-tuned variants. The architecture is based on transformer encoder-decoder design following T5, with improvements from Gemma 2: GQA, RoPE, GeGLU activation, RMSNorm, and interleaved local/global attention.

For reference, here is the T5Gemma PR that merged model support for these architectures into transformers.

jncraton · 2025-12-20T14:08:24Z

Sorry. I didn't mean to open this PR here yet. It needs a lot of work before it would be ready to go.

Copilot

Pull request overview

This PR adds support for the T5Gemma architecture, a family of encoder-decoder models developed by Google that adapts pretrained decoder-only models into an encoder-decoder architecture. T5Gemma combines T5's encoder-decoder design with Gemma 2 improvements including GQA, RoPE, GeGLU activation, RMSNorm, and pre/post layer normalization patterns.

Key changes:

- Introduces pre/post layer norm support for both encoder and decoder transformer layers
- Adds cross-attention pre/post layer norms for decoder layers to support T5Gemma's normalization pattern
- Implements T5GemmaLoader converter to map T5Gemma models from Hugging Face transformers to CTranslate2
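
For readers unfamiliar with the pre/post layer norm pattern mentioned above, here is a minimal sketch of one residual sub-block as it is commonly described for Gemma 2: normalize the input, apply the sublayer, normalize the sublayer output, then add the residual. This is not the code from this PR. The names `rms_norm` and `pre_post_block` are hypothetical, and the plain `weight * x / rms` scaling is an assumption (some implementations scale by `1 + weight`).

```python
import math
from typing import Callable, List

Vector = List[float]


def rms_norm(x: Vector, weight: Vector, eps: float = 1e-6) -> Vector:
    """RMSNorm: divide by the root mean square of x, then scale per dimension."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def pre_post_block(
    x: Vector,
    sublayer: Callable[[Vector], Vector],
    pre_weight: Vector,
    post_weight: Vector,
) -> Vector:
    """One residual sub-block with a norm before and after the sublayer."""
    normed = rms_norm(x, pre_weight)     # pre layer norm
    out = sublayer(normed)               # attention or feed-forward (hypothetical callable)
    out = rms_norm(out, post_weight)     # post layer norm, applied before the residual add
    return [a + b for a, b in zip(x, out)]


# Usage with an identity "sublayer" standing in for attention or feed-forward
result = pre_post_block([3.0, 4.0], lambda v: v, [1.0, 1.0], [1.0, 1.0])
```

In a decoder layer the same wrapper would, under this reading, be applied three times: around self-attention, cross-attention, and the feed-forward network.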

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

File	Description
src/layers/transformer.cc	Implements pre/post layer norm logic for encoder and decoder layers, including optional cross-attention layer norms
include/ctranslate2/layers/transformer.h	Adds layer norm member variables to support pre/post normalization in encoder and decoder layers
python/ctranslate2/specs/transformer_spec.py	Extends transformer specs to support pre_post_layer_norm configuration parameter and creates appropriate layer norm specs
python/ctranslate2/converters/transformers.py	Adds T5GemmaLoader to convert T5Gemma models, including weight mapping, vocabulary handling, and RoPE configuration
README.md	Updates supported models list to include T5Gemma

Comments suppressed due to low confidence (1)

python/ctranslate2/converters/transformers.py:1347

Variable num_heads_kv_enc is not used.

            num_heads_kv_enc = None

BBC-Esq · 2025-12-20T16:14:09Z

CTranslate2 doesn't currently support the encoder/decoder-based embedding models, so I'm excited to see this. Have you started working on the Qwen3 embedding models yet, by chance? CTranslate2's C++ and Python code would both have to be modified because it currently can't return the "intermediate states" or whatever you call them. Seems like the same issue with the T5Gemma architecture?

jncraton · 2025-12-20T17:50:25Z

@BBC-Esq Older encoder/decoder models (such as T5) are actually currently supported, but this would add support for one with a more modern architecture. Embedding models can also be used currently. I've used forward_batch and .last_hidden_state to compute document embeddings.

BBC-Esq · 2025-12-20T19:00:56Z

@jncraton Can you please teach me how to do that? Maybe you have a sample script? I can convert the Qwen3 embedding models because the architecture is supported, but I can't figure out how to use them with CTranslate2 yet.

BBC-Esq · 2025-12-20T19:36:28Z

The forward_batch / .last_hidden_state approach works for encoder-only models (BERT, XLM-RoBERTa, etc.) loaded via ctranslate2.Encoder, which returns an EncoderForwardOutput object containing last_hidden_state and pooler_output.
However, Qwen3-Embedding is a decoder-only architecture and it gets loaded as a ctranslate2.Generator. The Generator.forward_batch() method returns logits (post-lm_head output), not the hidden states.
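
A minimal sketch of the encoder-only route described above, for illustration only. It assumes a model already converted to a CTranslate2 directory, input that is already tokenized into token strings (including any special tokens the model expects), right-padded batches, and mean pooling as the embedding strategy. The helper names `mean_pool` and `embed` are hypothetical, and this does not apply to decoder-only models loaded as `ctranslate2.Generator`.

```python
from typing import List, Sequence


def mean_pool(
    hidden: Sequence[Sequence[Sequence[float]]], lengths: Sequence[int]
) -> List[List[float]]:
    """Average the token vectors of each sequence, ignoring padding positions."""
    pooled = []
    for states, length in zip(hidden, lengths):
        dim = len(states[0])
        real = states[:length]  # positions past `length` are assumed to be padding
        pooled.append([sum(vec[d] for vec in real) / length for d in range(dim)])
    return pooled


def embed(model_dir: str, batch_tokens: List[List[str]]) -> List[List[float]]:
    """Embed pre-tokenized texts with an encoder-only model converted to CTranslate2."""
    # Imported lazily so mean_pool can be used without these packages installed.
    import ctranslate2
    import numpy as np

    encoder = ctranslate2.Encoder(model_dir, device="cpu")
    output = encoder.forward_batch(batch_tokens)
    # last_hidden_state is a StorageView of shape [batch, time, hidden];
    # on CPU it can be converted to a NumPy array.
    hidden = np.array(output.last_hidden_state).tolist()
    return mean_pool(hidden, [len(tokens) for tokens in batch_tokens])
```

Whether mean pooling, `pooler_output`, or a first-token vector is the right choice depends on how the embedding model was trained.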

That is my understanding, unless you're aware of some kind of workaround. I believe we'd need to modify CTranslate2's C++ inference path to optionally return hidden states from decoder models.

jncraton · 2025-12-20T19:44:03Z

@BBC-Esq That's my understanding as well. I'm not aware of a way to get access to the embeddings from Qwen3 without modifying CTranslate2.

Initial t5gemma implementation

e04da2e

Copilot AI review requested due to automatic review settings December 20, 2025 14:05

Copilot started reviewing on behalf of jncraton December 20, 2025 14:06

jncraton closed this Dec 20, 2025

Copilot AI reviewed Dec 20, 2025

T5Gemma #1955
jncraton wants to merge 1 commit into OpenNMT:master from jncraton:t5gemma
