Fix loading diffusers model (+support F64/I64 types) #681

Merged: 3 commits merged into leejet:master on Jul 6, 2025

Conversation


stduhpf commented May 18, 2025

Loading diffusers models wasn't working; this fixes it for me. I also added support for SDXL diffusers models (the second text encoder wasn't being handled).

During testing, I came across models with 64-bit types, so I added support for those too, since they are supported by GGML. (Fixes #153, #669)
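
For illustration, loading a diffusers checkpoint is largely a matter of remapping tensor names from the diffusers layout to the internal one (the real mapping lives in convert_tensor_name in model.cpp). A minimal sketch of the idea, with hypothetical prefixes, not the exact tables from the PR:

#include <string>
#include <utility>
#include <vector>

// Sketch only; these prefixes are illustrative, not the exact tables
// used by model.cpp's convert_tensor_name().
std::string remap_diffusers_name(const std::string& name) {
    static const std::vector<std::pair<std::string, std::string>> prefixes = {
        {"text_encoder.",   "cond_stage_model."},      // first CLIP encoder
        {"text_encoder_2.", "cond_stage_model.1."},    // SDXL's second encoder (hypothetical target)
        {"unet.",           "model.diffusion_model."},
        {"vae.",            "first_stage_model."},
    };
    for (const auto& p : prefixes) {
        if (name.rfind(p.first, 0) == 0) {  // name starts with the diffusers prefix
            return p.second + name.substr(p.first.size());
        }
    }
    return name;  // already in the internal naming scheme
}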


stduhpf commented May 18, 2025

It would be nice to get it to work for DiT models too, but it looks like a lot of work, because the qkv matrices are split in the diffusers format. That would probably require a significant refactor of the model loading logic...
Maybe just refactoring the model-specific code itself to accommodate the different conventions could be a potential solution too.
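
The fusion step itself is simple; the hard part is presumably that the loader maps each source tensor to one destination tensor, so combining three sources doesn't fit the current flow. A rough sketch of fusing split to_q/to_k/to_v weights into one qkv matrix, assuming contiguous row-major f32 data of shape [d, d] each:

#include <cstddef>
#include <cstring>
#include <vector>

// Rough sketch: concatenate separate to_q/to_k/to_v weights (diffusers
// convention) into one fused qkv matrix of shape [3*d, d].
std::vector<float> fuse_qkv(const float* q, const float* k, const float* v, size_t d) {
    std::vector<float> qkv(3 * d * d);
    std::memcpy(qkv.data(),             q, d * d * sizeof(float));
    std::memcpy(qkv.data() + d * d,     k, d * d * sizeof(float));
    std::memcpy(qkv.data() + 2 * d * d, v, d * d * sizeof(float));
    return qkv;
}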

@vmobilis mentioned this pull request May 23, 2025
model.cpp (outdated), comment on lines 1109 to 1112:
std::string new_name = prefix + name;
new_name = convert_tensor_name(new_name);

TensorStorage tensor_storage(new_name, type, ne, n_dims, file_index, ST_HEADER_SIZE_LEN + header_size_ + begin);
stduhpf (author) replied:


This is breaking some LoRAs.

leejet merged commit dafc32d into leejet:master Jul 6, 2025
9 checks passed

leejet commented Jul 6, 2025

Thank you for your contribution.


wbruna commented Jul 8, 2025

@stduhpf, I'm getting an 'f64 unsupported' error when trying the model mentioned in #153 (https://civitai.com/models/7371/rev-animated?modelVersionId=425083, fp32 file):

./sd --model ./revAnimated_v2Rebirth.safetensors -p flower 
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 7600 XT (RADV NAVI33) (radv) | uma: 0 | fp16: 1 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
[INFO ] stable-diffusion.cpp:210  - loading model from './revAnimated_v2Rebirth.safetensors'
[INFO ] model.cpp:998  - load ./revAnimated_v2Rebirth.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:259  - Version: SD 1.x 
[INFO ] stable-diffusion.cpp:292  - Weight type:                 f32
[INFO ] stable-diffusion.cpp:293  - Conditioner weight type:     f32
[INFO ] stable-diffusion.cpp:294  - Diffusion model weight type: f32
[INFO ] stable-diffusion.cpp:295  - VAE weight type:             f16
  |====================>                             | 459/1130 - 0.00it/s
terminate called after throwing an instance of 'std::runtime_error'
  what():  type f64 unsupported for integer quantization: no dequantization available
Aborted (core dumped)

This is on master-dafc32d.


stduhpf commented Jul 8, 2025

Yeah, F64 is still not properly supported. I didn't realize there isn't a built-in "dequantization" (or rather quantization, in this case) function for F64 to F32 in GGML, so I just implemented it in #726. That fixes the crash at load time, but I can't get inference to run with either the Vulkan or CPU backend when using models with F64 weights. Forcing it to use F32 with the new changes I made does work (--type f32).
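
For reference, the conversion itself amounts to a narrowing cast over each value; a minimal sketch of such an F64-to-F32 pass (not the actual code from #726):

#include <cstddef>

// Sketch only, not the code from #726: convert F64 weights to F32 at
// load time. The narrowing cast can lose precision, which should be
// harmless for weights that fit comfortably in f32 range.
void f64_to_f32(const double* src, float* dst, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        dst[i] = static_cast<float>(src[i]);
    }
}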

Successfully merging this pull request may close these issues.

unsupported dtype 'F64'