
Adding Omnilingual ASR models #43265

Draft
ebezzam wants to merge 13 commits into huggingface:main from ebezzam:omnilingual


@ebezzam commented Jan 13, 2026

Contributor

What does this PR do?

Adds Omnilingual ASR: https://github.com/facebookresearch/omnilingual-asr

CTC-variant

CUDA_VISIBLE_DEVICES=1 RUN_SLOW=1 pytest tests/models/omniasr/test_modeling_omniasr.py::OmniASRForCTCIntegrationTest
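
For readers less familiar with CTC, here is a minimal sketch of greedy CTC decoding, the usual way a CTC head's frame-level predictions are turned into a token sequence. It is not taken from this PR: the blank ID of 0, the toy vocabulary, and the function name `ctc_greedy_decode` are all assumptions made for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, blank_id=0):
    """Greedy CTC decoding: collapse consecutive repeats, then drop blanks.

    `blank_id=0` is an assumption for this sketch; the real blank index
    depends on the model's vocabulary.
    """
    # Step 1: merge runs of identical frame predictions into one token.
    collapsed = [token for token, _ in groupby(frame_ids)]
    # Step 2: remove the blank symbol that separates genuine repeats.
    return [token for token in collapsed if token != blank_id]


# Hypothetical toy vocabulary, not the model's actual one.
vocab = {1: "h", 2: "i"}
frames = [0, 1, 1, 0, 2, 2, 2, 0]  # per-frame argmax over the logits
decoded = ctc_greedy_decode(frames)
text = "".join(vocab[token] for token in decoded)
print(text)
```

Because blanks are removed only after repeats are collapsed, a blank between two identical tokens keeps them distinct, which is how CTC represents doubled characters.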

LLM-variant

  • functional conversion to a (more) Transformers-compatible checkpoint: https://huggingface.co/bezzam/omniasr-llm-300m-v2
  • functional modeling code (e.g. verified by running the example script)
    • Probably need to define a new processor and tokenizer that can handle the language IDs, and then create a new checkpoint?
  • passing single + batch integration tests:
CUDA_VISIBLE_DEVICES=1 RUN_SLOW=1 pytest tests/models/omniasr/test_modeling_omniasr.py::OmniASRForConditionalGenerationIntegrationTest
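
The open question above about a tokenizer that can handle language IDs could be approached in several ways. One possibility, sketched below purely as an illustration, is to reserve one token ID per language and prepend it to the encoded text. Nothing here comes from the PR or the upstream repository: the class `LanguageAwareTokenizer`, the character-level vocabulary, and the language codes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LanguageAwareTokenizer:
    """Hypothetical character tokenizer that prepends a language-ID token."""

    char_to_id: dict  # character -> token ID (toy vocabulary)
    lang_to_id: dict  # language code -> reserved token ID (assumed layout)

    def encode(self, text, lang):
        if lang not in self.lang_to_id:
            raise ValueError(f"Unknown language code: {lang!r}")
        # The language token comes first so the decoder can condition on it.
        return [self.lang_to_id[lang]] + [self.char_to_id[c] for c in text]

    def decode(self, ids):
        id_to_char = {idx: char for char, idx in self.char_to_id.items()}
        lang_ids = set(self.lang_to_id.values())
        # Language tokens carry no text, so they are skipped when decoding.
        return "".join(id_to_char[i] for i in ids if i not in lang_ids)


tokenizer = LanguageAwareTokenizer(
    char_to_id={"a": 0, "b": 1},
    lang_to_id={"eng_Latn": 2, "fra_Latn": 3},  # illustrative codes only
)
ids = tokenizer.encode("ab", lang="fra_Latn")
print(ids, tokenizer.decode(ids))
```

Keeping the language IDs in the same ID space as the text tokens means the model input stays a single sequence; whether that matches the checkpoint's actual layout would need to be checked against the conversion script.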

When both the CTC and LLM variants are working:

  • (in progress) prune to the essential code paths and config, updating the conversion script iteratively
  • (in progress) shift into a modular file and see what can be reused from Wav2Vec2, SeamlessM4T, and recent audio LMs
  • custom tokenizer? Currently using LasrTokenizer
  • full tests

@github-actions

Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43265&sha=96f6b6


3 participants