WIP: Retrieval Augmented Diffusion Models by isamu-isozaki · Pull Request #1846 · huggingface/diffusers

isamu-isozaki · 2022-12-27T21:14:55Z

Pulled code from patil's branch to start making the retriever class and training script for rdm. I'll base this code on

https://github.com/afiaka87/retrieval-augmented-diffusion

and

isamu-isozaki · 2022-12-28T02:21:04Z

I found that the Hugging Face datasets library already has faiss integration. Now trying to figure out how to combine it with CLIPVisionModel.

isamu-isozaki · 2022-12-28T04:56:57Z

@patil-suraj Hi! I moved to this branch. I started working on the retrieval class. Will keep working/testing on it tomorrow

isamu-isozaki · 2022-12-29T01:47:52Z

I think I almost got the retrieval class working. Will probably finish and push today, then work on the training script tomorrow. Will also post example results.

isamu-isozaki · 2022-12-29T06:30:48Z

Success! I'll update scripts and push

isamu-isozaki · 2022-12-29T07:05:35Z

This is an example! Currently I embed the Oxford pet dataset, index the embeddings with faiss, and use kNN to get the 10 nearest neighbors of a prompt's CLIP text embedding.

The code I used for this example can be found in examples/rdm in a jupyter notebook. I'll remove the notebook later on for clean code.
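For readers who want to see the retrieval idea in isolation, here is a minimal sketch of a nearest-neighbor lookup over embeddings. It is an assumption-laden illustration, not the notebook's code: it uses a brute-force cosine-similarity search in plain Python instead of a faiss index, and the 2-D vectors are toy values standing in for real CLIP image and text embeddings. The names `normalize` and `knn` are hypothetical.

```python
import math


def normalize(vec):
    """Scale a non-zero vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def knn(query, index_embeddings, k=10):
    """Return the indices of the k embeddings most similar to the query (brute force)."""
    q = normalize(query)
    scores = []
    for i, emb in enumerate(index_embeddings):
        e = normalize(emb)
        scores.append((sum(a * b for a, b in zip(q, e)), i))
    # Highest cosine similarity first.
    scores.sort(key=lambda s: s[0], reverse=True)
    return [i for _, i in scores[:k]]


# Toy stand-ins for image embeddings (assumption: real ones would come from a CLIP image encoder).
image_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
# Toy stand-in for the prompt's text embedding.
text_embedding = [1.0, 0.0]

neighbors = knn(text_embedding, image_embeddings, k=2)
print(neighbors)
```

A real index such as faiss does the same job with approximate or optimized search, which matters once the dataset has more than a few thousand images.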

isamu-isozaki · 2022-12-29T07:11:31Z

Next I'll start making a training script

isamu-isozaki · 2022-12-29T22:06:14Z

I'll also try using the clip-retrieval tool by LAION as an option for training. I'll double-check the paper on the implementation, then go ahead and train.

neverix · 2022-12-31T15:13:22Z

I get

ValueError: The component <class 'transformers.models.clip.image_processing_clip.CLIPImageProcessor'>
of <class 'diffusers.pipelines.rdm.pipeline_rdm.RDMPipeline'> cannot be loaded as it does not seem to have
any of the loading methods defined in {'ModelMixin': ['save_pretrained', 'from_pretrained'], 'SchedulerMixin': 
['save_config', 'from_config'], 'DiffusionPipeline': ['save_pretrained', 'from_pretrained'], 'OnnxRuntimeModel': 
['save_pretrained', 'from_pretrained'], 'PreTrainedTokenizer': ['save_pretrained', 'from_pretrained'], 
'PreTrainedTokenizerFast': ['save_pretrained', 'from_pretrained'], 'PreTrainedModel': 
['save_pretrained', 'from_pretrained'], 'FeatureExtractionMixin': ['save_pretrained', 'from_pretrained']}.

with fusing/rdm. What could cause this?

isamu-isozaki · 2022-12-31T15:32:53Z

@neverix
Is this the error you get when doing

from diffusers import RDMPipeline

pipe = RDMPipeline.from_pretrained("fusing/rdm")
pipe.to("cuda")

prompt = "a happy pineapple" 
images = pipe(prompt).images

or

retrieved_images = [...]  # placeholder: a list of PIL images
images = pipe(prompt, retrieved_images=retrieved_images).images

?

isamu-isozaki · 2022-12-31T15:36:38Z

I'll try to reproduce the problem and let you know. It is pretty weird, since the ImageProcessor does not have a load-from-pretrained method, so it's odd that the pipeline is trying to load it.
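To make the error message easier to read, here is a minimal sketch of the kind of registry lookup that could produce it. This is a simplified illustration under stated assumptions, not the actual diffusers loading code, and the thread does not confirm the root cause. The classes `OldStyleExtractor` and `NewStyleProcessor`, the `BASE_CLASSES` mapping, and the function `find_load_method` are hypothetical; only the shape of the registry (base class name mapped to a save method and a load method) is taken from the dict printed in the error.

```python
# Hypothetical stand-ins for library base classes (NOT the real diffusers/transformers code).
class FeatureExtractionMixin:
    @classmethod
    def from_pretrained(cls, path):
        return cls()

    def save_pretrained(self, path):
        pass


class OldStyleExtractor(FeatureExtractionMixin):
    """Inherits from a registered base class, so the lookup accepts it."""


class NewStyleProcessor:
    """Has no registered base class, so the lookup below rejects it."""


# Registry shaped like the dict in the error: base class name -> [save method, load method].
LOADABLE_CLASSES = {"FeatureExtractionMixin": ["save_pretrained", "from_pretrained"]}
BASE_CLASSES = {"FeatureExtractionMixin": FeatureExtractionMixin}


def find_load_method(component_cls):
    """Return the load method name of the first registered base class the component inherits from."""
    for base_name, (_save_name, load_name) in LOADABLE_CLASSES.items():
        if issubclass(component_cls, BASE_CLASSES[base_name]):
            return load_name
    raise ValueError(
        f"The component {component_cls} cannot be loaded as it does not seem to have "
        f"any of the loading methods defined in {LOADABLE_CLASSES}."
    )
```

With a lookup like this, a component is rejected whenever its class does not inherit from any registered base class, regardless of which methods it defines itself. A mismatch between library versions is one way that could happen, but that is a guess rather than something established here.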

neverix · 2022-12-31T22:12:58Z

@neverix Is this the error you get when doing

from diffusers import RDMPipeline

pipe = RDMPipeline.from_pretrained("fusing/rdm")
pipe.to("cuda")

prompt = "a happy pineapple" 
images = pipe(prompt).images

or

retrieved_images = [...]  # placeholder: a list of PIL images
images = pipe(prompt, retrieved_images=retrieved_images).images

?

I get it in the model creation step, so there's not much of a difference

isamu-isozaki · 2023-01-01T05:14:39Z

@neverix Thanks, I was able to reproduce it. I'll try figuring out a fix tomorrow.

isamu-isozaki · 2023-01-01T17:20:35Z

Interesting I got the same result when I did

from diffusers import StableDiffusionPipeline
pipeline = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

isamu-isozaki · 2023-02-26T15:15:02Z

OK! I made some changes so the retriever can index with a general model when given the corresponding argument (e.g. moco, simclr, ibot). Next, I will wrap up the training/inference scripts.

isamu-isozaki · 2023-02-26T15:19:43Z

once that's done I'll fix the checks!

isamu-isozaki · 2023-02-26T15:57:44Z

This is for my personal research, but I will also try adding support for ibot embeddings. Honestly, I doubt anything will need changing.

isamu-isozaki · 2023-02-26T17:18:13Z

I think I'll remove clip-retrieval from the script for now, since:

- Training with this script will end up making a lot of requests to their web service, which is probably not their intended use case.
- The code will be way cleaner without supporting both clip-retrieval and Retriever.

Let me know if anyone needs clip-retrieval in the script and I'll add it back in!

isamu-isozaki · 2023-02-27T04:40:25Z

Cleaned up the inference script some more. I think a lot of the common funcs I'll abstract away into some common files like datasets just to avoid copying code wrongly.

isamu-isozaki · 2023-02-27T05:00:55Z

tomorrow I'll hopefully finish cleaning up the training scripts and might ask for a review again!

isamu-isozaki · 2023-02-27T20:11:33Z

Not finished yet, but some notes:

- I can offload a lot of the processing in the dataset by adding a pre-computation step that computes the top-k nearest neighbors beforehand and stores them in a column. That way, we save computation on retrieval.
- The same can be done for a lot of the processing in diffusers, not just here: for example, precomputing text embeddings, VAE embeddings, etc.
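The pre-computation idea in the notes above can be sketched as follows. This is a minimal illustration under assumptions, not the PR's implementation: the dataset is a plain list of dicts rather than a Hugging Face dataset, the search is brute-force cosine similarity rather than faiss, and the names `cosine`, `add_neighbor_column`, and the `"neighbors"` column are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def add_neighbor_column(rows, k, column="neighbors"):
    """Store, for each row, the indices of its k most similar other rows."""
    for i, row in enumerate(rows):
        scored = [
            (cosine(row["embedding"], other["embedding"]), j)
            for j, other in enumerate(rows)
            if j != i  # a row is not its own neighbor
        ]
        scored.sort(key=lambda s: s[0], reverse=True)
        row[column] = [j for _, j in scored[:k]]
    return rows


# Toy embeddings standing in for precomputed image embeddings.
rows = [
    {"embedding": [1.0, 0.0]},
    {"embedding": [0.9, 0.1]},
    {"embedding": [0.0, 1.0]},
]
add_neighbor_column(rows, k=1)
print([row["neighbors"] for row in rows])
```

Once the column exists, a data loader only has to look up the stored indices, so no similarity search runs while batches are being built.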

isamu-isozaki · 2023-02-27T20:11:50Z

Anyways stopping a bit here for now but will resume tomorrow!

isamu-isozaki · 2023-03-02T14:43:05Z

@patrickvonplaten @patil-suraj Hi! I think the training scripts might take a while so I can move them to a separate pr for an easier review!

isamu-isozaki · 2023-03-02T14:53:21Z

For now, will be cleaning anyway!

williamberman · 2023-03-21T02:41:58Z

hey @isamu-isozaki if we could isolate the PR to just the pipeline and remove the colossalai pipeline and the training scripts, that would be helpful for getting the PR merged

isamu-isozaki · 2023-03-21T04:03:36Z

@williamberman Got it! Sounds good. Will do tomorrow

isamu-isozaki · 2023-03-22T03:43:30Z

@williamberman Hi! Just confirming but do you think I should keep the inference scripts? Can remove them too!

isamu-isozaki · 2023-03-22T03:47:34Z

Let me remove it for now.

github-actions · 2023-04-15T15:03:58Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

isamu-isozaki · 2023-04-15T15:51:01Z

Ah, let me clean up a bit more

isamu-isozaki · 2023-04-27T16:19:48Z

sorry got a bit preoccupied. Let me close this pr and open it once I clean up some things

un1tz3r0 · 2023-04-27T17:26:35Z

Hi, so I believe I implemented something similar using clip-retrieval back in the days of disco-diffusion (we've come such a long way since then).

I took care to implement an asyncio-based, high-performance parallel downloader; it can grab thousands of images from the URLs returned by clip-retrieval in pretty reasonable time.

The repo is here: un1tz3r0/anythingdiffusion

isamu-isozaki · 2023-04-27T23:37:23Z

@un1tz3r0 nice! Looks awesome thanks

patil-suraj and others added 9 commits November 7, 2022 14:24

(hacky) allow attention_head_dim as Tuple

24c7052

add pipeline

87e61c8

add in init

36feb3a

add in init

686d924

fix elcip encode, default args

e96d564

allow passing retrieved_images

b09b8d7

remove negative prompt

a502823

fix docs

a04ca64

Starting retriever

86ea213

Adding retriever class and jupyter notebook for test

db3f2ec

Modified nb

d4dfad9

Made retriever!

19fc4c4

Began working on training script

ff40a90

isamu-isozaki added 4 commits January 1, 2023 12:23

Removed wandb

8d68c9c

Removed unrelavant files

2c1114b

Merged with main

9645bb4

Merged with main

e6289fe

Removed device='cuda' and using model's device instead

6118e0c

isamu-isozaki added 2 commits February 26, 2023 12:40

Removed clip retrieval+cleaning up inference script

0de7534

Fixed up inference script

d124f38

Cleaned up scripts a bit

80a677a

Removing frida pics altho they are cute

72b0412

isamu-isozaki added 2 commits March 21, 2023 23:39

Removed training scripts+colossalai pipeline

7643056

Ran black

e728f98

Remove example

a6dc7b5

github-actions bot added the stale Issues that haven't received updates label Apr 15, 2023

isamu-isozaki closed this Apr 27, 2023

isamu-isozaki mentioned this pull request May 1, 2023

Retrieval Augmented Diffusion Models #3297

Merged


7 participants
