Instruct-Pix2pix support #679


Draft: stduhpf wants to merge 7 commits into master

Conversation

stduhpf (Contributor) commented May 15, 2025

ref: #61

sd.exe -M img2img --model instruct-pix2pix-00-22000.safetensors -p "turn him into a cyborg" --color --strength 1 -i .\example.jpg --steps 50 --cfg-scale 7.5 --guidance 1.2 --sampling-method euler_a

(input and output example images)

sd.exe -M img2img --model instruct-pix2pix-00-22000.safetensors -p "Make it a cat" --strength 1 -i input.png --steps 100 --cfg-scale 7.5 --guidance 1.5 --sampling-method euler_a --schedule karras

(input and output example images)

TODOs:

  • Classifier-free guidance (CFG) for two conditionings (see the sketch after this list)
  • Fix UX (it's probably best not to reuse the distilled guidance parameter for something completely different like image conditioning)
  • Check if the implementation is correct
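For reference, the InstructPix2Pix paper combines three noise predictions (fully unconditional, image-conditioned only, and image+text-conditioned) with two guidance scales. Below is a minimal sketch of that combination over plain float buffers; the function and parameter names are illustrative and not this repo's actual API.

```cpp
// Sketch of the two-conditioning CFG combination from the InstructPix2Pix paper:
//   eps = eps(z, 0, 0)
//       + s_img  * (eps(z, c_img, 0)     - eps(z, 0, 0))
//       + s_txt  * (eps(z, c_img, c_txt) - eps(z, c_img, 0))
#include <cstddef>

void combine_ip2p_cfg(const float* eps_uncond,  // eps(z_t, no image cond, no text cond)
                      const float* eps_img,     // eps(z_t, image cond,    no text cond)
                      const float* eps_full,    // eps(z_t, image cond,    text cond)
                      float*       eps_out,
                      size_t       n,
                      float        cfg_text,    // text guidance scale  (e.g. --cfg-scale 7.5)
                      float        cfg_image) { // image guidance scale (e.g. --guidance 1.5)
    for (size_t i = 0; i < n; i++) {
        eps_out[i] = eps_uncond[i]
                   + cfg_image * (eps_img[i]  - eps_uncond[i])
                   + cfg_text  * (eps_full[i] - eps_img[i]);
    }
}
```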

rmatif commented May 15, 2025

Awesome! Could you please take a look at cosxl-edit as well? It acts as an ip2p model, if I understood correctly. I think we're just missing the EDM VPred schedule.

stduhpf (Contributor, Author) commented May 15, 2025

> Awesome! Could you please take a look at cosxl-edit as well? It acts as an ip2p model, if I understood correctly. I think we're just missing the EDM VPred schedule.

I may take a look at it later.

stduhpf (Contributor, Author) commented May 16, 2025

For some reason, the "image CFG" (controlled by the --guidance flag for now) needs to be very high (>10) to get anything resembling the input image. This behavior does not match the Hugging Face demo or the example on their GitHub. I can't figure out what I'm doing wrong.

stduhpf (Contributor, Author) commented May 16, 2025

Ah, I think I found the issue. By default, the init latent is sampled from the VAE distribution, but pix2pix expects the mean of the distribution.
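For context, here is a sketch of the difference, assuming the encoder outputs a per-element mean and log-variance of a diagonal Gaussian (illustrative names, not this repo's actual VAE code): regular img2img samples z = mean + exp(0.5 * logvar) * eps, while the InstructPix2Pix image conditioning should take the mean directly.

```cpp
#include <cstddef>
#include <cmath>
#include <random>

void encode_image_latent(const float* mean, const float* logvar,
                         float* z, size_t n, bool sample_posterior,
                         std::mt19937& rng) {
    std::normal_distribution<float> gauss(0.0f, 1.0f);
    for (size_t i = 0; i < n; i++) {
        if (sample_posterior) {
            // regular img2img init latent: z = mean + sigma * eps, sigma = exp(0.5 * logvar)
            z[i] = mean[i] + std::exp(0.5f * logvar[i]) * gauss(rng);
        } else {
            // InstructPix2Pix image conditioning: use the mean of the distribution
            z[i] = mean[i];
        }
    }
}
```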

stduhpf (Contributor, Author) commented May 16, 2025

I'm pretty sure it's working properly now. I think inpainting might be slightly improved too, especially when strength is set below 1 and with higher CFG.

stduhpf (Contributor, Author) commented May 16, 2025

> Awesome! Could you please take a look at cosxl-edit as well? It acts as an ip2p model, if I understood correctly. I think we're just missing the EDM VPred schedule.

These might also be interesting, and they may be even easier to implement:
https://huggingface.co/diffusers/sdxl-instructpix2pix-768
https://huggingface.co/CaptainZZZ/sd3-instructpix2pix/tree/main

Edit: The SDXL one was pretty easy. Now I can't figure out how to easily convert SD3.x models from the diffusers format to the original format, so I can't test whether it would work...
