Feat: add Flux 1 Lite 8B (Freepik) support #474


Merged · 3 commits merged into leejet:master on Nov 23, 2024
Conversation

stduhpf (Contributor) commented Nov 20, 2024

Diffusion model weights: https://huggingface.co/Freepik/flux.1-lite-8B-alpha/blob/main/flux.1-lite-8B-alpha.safetensors.

It's basically Flux 1 Dev with most of the double blocks removed. Flux 1 Lite Q4_k is smaller than Flux 1 Dev Q3_k, while delivering better image quality (in my subjective opinion). It's also about 25% faster during image generation.

.\build\bin\Release\sd.exe --diffusion-model ..\ComfyUI\models\unet\flux.1-lite-8B-alpha-q4_k.gguf --vae ..\ComfyUI\models\vae\ae.q8_0.gguf --clip_l ..\ComfyUI\models\clip\clip_l.q8_0.gguf --t5xxl ..\ComfyUI\models\clip\t5xxl_q4_k.gguf -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -t 24 --vae-tiling --color -v -W 1024 -H 1024
[output image]

| Flux 1 Dev Q3_k (5,114,284,416 bytes) | Flux 1 Lite Q4_k (4,819,297,120 bytes) |
|---|---|
| [dev-q3_k image] | [lite-q4_k image] |
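For reference, a sketch of how a quantized .gguf like the one above can be produced with sd.cpp's convert mode (the -M convert / -o / --type flags are assumed from the project README; verify against sd.exe --help, and adjust paths to your setup):

.\build\bin\Release\sd.exe -M convert -m ..\ComfyUI\models\unet\flux.1-lite-8B-alpha.safetensors -o ..\ComfyUI\models\unet\flux.1-lite-8B-alpha-q4_k.gguf --type q4_k -v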

Green-Sky (Contributor) commented Nov 21, 2024

Tested this on top of my flash attention PR, and the results are good 🚀.

CUDA (RTX 2070 Mobile / 8 GB VRAM)

| quant | dims | FA (🟢 on / 🔴 off) | compute buffer size | speed |
|---|---|---|---|---|
| q3_k | 512x512 | 🔴 | 398.50 MB (VRAM) | 1.67 s/it |
| q3_k | 512x512 | 🟢 | 248.50 MB (VRAM) | 1.39 s/it |
| q3_k | 1024x512 | 🔴 | 942.75 MB (VRAM) | 3.32 s/it |
| q3_k | 1024x512 | 🟢 | 456.75 MB (VRAM) | 2.62 s/it |
| q3_k | 768x768 | 🔴 | 1105.07 MB (VRAM) | 3.91 s/it |
| q3_k | 768x768 | 🟢 | 505.07 MB (VRAM) | 3.01 s/it |
| q3_k | 1024x1024 | 🔴 | 2577.25 MB (VRAM) | 8.06 s/it |
| q3_k | 1024x1024 | 🟢 | 843.25 MB (VRAM) | 5.79 s/it |
| q4_k | 512x512 | 🔴 | 398.50 MB (VRAM) | 1.56 s/it |
| q4_k | 512x512 | 🟢 | 248.50 MB (VRAM) | 1.30 s/it |
| q4_k | 1024x512 | 🔴 | 942.75 MB (VRAM) | 3.10 s/it |
| q4_k | 1024x512 | 🟢 | 456.75 MB (VRAM) | 2.43 s/it |
| q4_k | 768x768 | 🔴 | 1105.07 MB (VRAM) | 3.64 s/it |
| q4_k | 768x768 | 🟢 | 505.07 MB (VRAM) | 2.79 s/it |
| q4_k | 1024x1024 | 🔴 | OOM | |
| q4_k | 1024x1024 | 🟢 | 843.25 MB (VRAM) | 5.20 s/it |

Good stuff. The compute buffer sizes are, unsurprisingly, the same as for dev/schnell, and the speed is faster.

Green-Sky (Contributor)

With this model, the flash attention PR, and --vae-tiling, you can do some obscene stuff like q4_k at 2048x1024 on my 8 GiB of VRAM (example command below).
[output image]
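A sketch of what such a run could look like, reusing the invocation from the PR description (the --diffusion-fa flag name is an assumption based on the flash attention PR; check sd.exe --help on that branch before relying on it):

.\build\bin\Release\sd.exe --diffusion-model ..\ComfyUI\models\unet\flux.1-lite-8B-alpha-q4_k.gguf --vae ..\ComfyUI\models\vae\ae.q8_0.gguf --clip_l ..\ComfyUI\models\clip\clip_l.q8_0.gguf --t5xxl ..\ComfyUI\models\clip\t5xxl_q4_k.gguf -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler --diffusion-fa --vae-tiling -W 2048 -H 1024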

stduhpf (Contributor, Author) commented Nov 21, 2024

Sadly, Flash Attention is still not supported on the Vulkan backend. I might try merging in ggml-org/llama.cpp#10206 later to see how it goes.

leejet (Owner) commented Nov 23, 2024

Thank you for your contribution.

leejet merged commit 6ea8122 into leejet:master on Nov 23, 2024 (9 checks passed).
stduhpf deleted the freepik branch on Nov 23, 2024.