Hacking Latent Diffusion Models (2022)

Trying to understand these little AI-brains floating all around the internet in 2022, I bumped into a project called textualinversion. You feed it a concept with only 5 pictures, and it can come up with similar ones.

I wanted the model to understand what I'm talking about when I say "oldphone". I gave it some images of my DIY synthphone and asked for more of these.

Almost good :) I tweaked the training parameters a bit, and now it seems to understand my questions better :)

The good thing about AI is that you can make it unlearn things it was not supposed to do in the first place (if you know how you got there). Long live ckpt!

So, given 5 pictures of this phone I repurposed, the model came up with its own "oldphone" concept that should look like the input. LDM seemed to get much closer than stablediffusion. Interesting research.
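To give a feel for the idea, here is a toy sketch in plain Python. It is not the textualinversion code and there is no diffusion model in it. I'm assuming a tiny frozen linear "decoder" as a stand-in for the real model, and all names and sizes (`FROZEN_W`, `decode`, `train_embedding`, `EMBED_DIM`, ...) are made up for illustration. The one thing it tries to show is the core trick as I understand it: the model weights stay frozen, and only the embedding vector of one new pseudo-word ("oldphone") gets trained against a handful of examples.

```python
import random

# Toy stand-in for the textual inversion idea: the "model" is frozen and only
# the embedding of one new pseudo-word ("oldphone") is trained.
# Everything here (sizes, names, the linear "decoder") is hypothetical.

EMBED_DIM = 4   # size of the pseudo-word embedding
IMAGE_DIM = 6   # size of a toy "image" feature vector

_rng = random.Random(0)

# Frozen weights: read during training, never updated.
FROZEN_W = [[_rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for _ in range(IMAGE_DIM)]


def decode(embedding):
    """Frozen toy 'generator': maps an embedding to image features."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in FROZEN_W]


def make_examples(true_embedding, n=5, noise=0.05):
    """Build n noisy 'photos' of the same concept."""
    base = decode(true_embedding)
    return [[x + _rng.gauss(0, noise) for x in base] for _ in range(n)]


def loss(embedding, examples):
    """Mean squared error between the decoded embedding and the examples."""
    out = decode(embedding)
    total = sum((o - t) ** 2 for ex in examples for o, t in zip(out, ex))
    return total / len(examples)


def train_embedding(examples, steps=500, lr=0.02):
    """Plain gradient descent on the embedding only."""
    embedding = [0.0] * EMBED_DIM
    for _ in range(steps):
        out = decode(embedding)
        grad = [0.0] * EMBED_DIM
        for ex in examples:
            for i, (o, t) in enumerate(zip(out, ex)):
                err = 2 * (o - t) / len(examples)
                for j in range(EMBED_DIM):
                    grad[j] += err * FROZEN_W[i][j]
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding


# Five toy "pictures" of the concept, then learn an embedding for it.
examples = make_examples(true_embedding=[0.5, -0.3, 0.8, 0.1])
learned = train_embedding(examples)
print("loss before:", loss([0.0] * EMBED_DIM, examples))
print("loss after: ", loss(learned, examples))
```

The real thing swaps the linear toy for a frozen text encoder plus diffusion model and a denoising loss, but the shape is the same: a few images in, one new word vector out.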

I then further tweaked some of the input layers. Given the same prompt, it would dramatically change the output to something in a consistent style.

before

after

kaotec

