This is the official repository for the paper "Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On".
We propose to explicitly capitalize on visual correspondence as a prior to tame the diffusion process, instead of simply feeding the whole garment into the UNet as the appearance reference.
Create a conda environment and install the requirements:
conda create -n SPM-Diff python=3.9.0
conda activate SPM-Diff
cd SPM-Diff-main
pip install -r requirements.txt
In SPM, a set of semantic points on the garment is first sampled and matched to the corresponding points on the target person via local flow warping. These 2D cues are then augmented into 3D-aware cues using depth/normal maps, and the resulting semantic point matches serve as supervision for the diffusion model.
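As a rough illustration of these two steps, here is a minimal NumPy sketch: warping sampled garment points to the person image with a dense flow field, then lifting the matches into 3D-aware cues with a depth map. All names, shapes, and the random toy data are assumptions for illustration only, not the repo's actual API.

```python
import numpy as np

def warp_points(points, flow):
    """Map garment points to person-space via a dense local flow field.

    points: (N, 2) integer pixel coordinates (x, y) on the garment image.
    flow:   (H, W, 2) per-pixel displacement from garment to person image.
    """
    disp = flow[points[:, 1], points[:, 0]]  # look up the flow at each point
    return points + disp

def lift_to_3d(points, depth):
    """Augment 2D matched points into 3D-aware cues using a depth map."""
    xs = points[:, 0].astype(int)
    ys = points[:, 1].astype(int)
    z = depth[ys, xs]
    return np.concatenate([points, z[:, None]], axis=1)  # (N, 3)

# Toy usage with random data, just to show the shapes involved.
H, W, N = 256, 192, 32
rng = np.random.default_rng(0)
pts = np.stack([rng.integers(0, W, size=N), rng.integers(0, H, size=N)], axis=1)
flow = rng.normal(scale=2.0, size=(H, W, 2))
depth = rng.uniform(size=(H, W))
warped = np.clip(warp_points(pts, flow), [0, 0], [W - 1, H - 1])
cues_3d = lift_to_3d(warped, depth)
print(cues_3d.shape)  # (32, 3)
```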
You can download the Semantic Point Feature directly, or follow the instructions in preprocessing.md to extract it yourself.
You can download the VITON-HD dataset from here.
For inference, the following dataset structure is required (a quick layout check is sketched after the tree):
test
|-- image
|-- masked_vton_img
|-- warp-cloth
|-- cloth
|-- cloth_mask
|-- point
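Before running inference, you may want to verify that the test set is laid out as above. This is a minimal sketch; the root path `data/test` is an assumption, so adjust it to wherever you unpacked the dataset.

```python
from pathlib import Path

# Check that every expected subfolder of the test set exists.
root = Path("data/test")  # assumed location; change to your dataset root
expected = ["image", "masked_vton_img", "warp-cloth", "cloth", "cloth_mask", "point"]
missing = [d for d in expected if not (root / d).is_dir()]
if missing:
    raise FileNotFoundError(f"Missing subfolders under {root}: {missing}")
print("Dataset layout OK")
```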
Please download the pre-trained model from Link.
sh inference.sh
Thanks to the contributions of LaDI-VTON and GP-VTON.
@inproceedings{wan2025incorporating,
  title={Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On},
  author={Siqi Wan and Jingwen Chen and Yingwei Pan and Ting Yao and Tao Mei},
  booktitle={ICLR},
  year={2025}
}