ADD FG-CLIP2 #41886

Open
binwang777 wants to merge 4 commits into huggingface:main from binwang777:fgclip2

Conversation

@binwang777

What does this PR do?

FG-CLIP2 is a new-generation text-image cross-modal model that excels at fine-grained discrimination and embedding. It is a foundation model for fine-grained vision-language understanding in both English and Chinese.
Across 29 datasets and 8 diverse tasks, it consistently surpasses recent strong baselines such as SigLIP 2 and MetaCLIP 2, achieving the best reported performance to date in both languages.

This PR merges the model from https://github.com/binwang777/transformers/tree/fgclip2

@github-actions

Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, fgclip2

@Rocketknight1

Member

Benchmark results look impressive, yes! cc @molbap @yonigozlan @NielsRogge maybe, if any of you want to handle this?

@molbap

molbap commented Nov 5, 2025

Collaborator

Taking a look!

molbap self-requested a review on November 5, 2025, 08:01

@binwang777

binwang777 commented Feb 11, 2026

Author

@molbap @Rocketknight1 Hello, may I ask how much longer the code review will take? Could you please give us some suggestions? Thank you for your time. Our GitHub repository is https://github.com/360CVGroup/FG-CLIP

3 participants