Converting a model to iq3_s or iq3_xxs fails with a fatal error and aborts.
Error
sd -M convert -m realDream_sdxl6.safetensors --type iq3_s
[INFO ] model.cpp:908- load realDream_sdxl6.safetensors using safetensors format
[INFO ] model.cpp:1985- model tensors mem size: 2183.06MB
|=>| 55/2641 - 0.00it/s
Oops: found point 103 not on grid: 103000
/usr/src/debug/stable-diffusion.cpp-vulkan-git/stable-diffusion.cpp/ggml/src/ggml-quants.c:3929: fatal error
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
Does the crash happen with other models too? This looks like a bug in upstream GGML, and I've never seen this one before.
And yeah, conversion is slow, but it seems a bit hard to optimize it, and it's not something that's typically used a lot, so I don't think improving it has a high priority.
EDIT:
I tried with the same model you linked, and I can confirm the issue. The other SDXL models I tried didn't have this issue. I don't understand enough about the quantization process to figure out what's causing it though.
Might be related to ggml-org/llama.cpp#11773 and the linked issues within. You could try the fix proposed by compilade.
It really looks like the same kind of issue, but that patch doesn't fix the issue here (no changes for iq3_s, and iq3_xxs still fails in the same way despite the eps check).
Summary
Error
Command to test quants:
Test model: realDream_sdxl6 ( SDXL | F16 | 6.46GB )
Speed to convert quants (almost all of them)
Time to finish conversion:
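For anyone who wants to reproduce the timings, here is a minimal sketch of a timing harness. It is not the command used for the numbers in this report. It assumes `sd` is on the PATH and accepts exactly the flags shown in the failing command above (`-M convert -m <model> --type <quant>`). The list of quant types is limited to the ones named in this thread, and the helper names (`build_command`, `time_conversion`, `main`) are made up for illustration.

```python
import subprocess
import time

# Assumption: only the quant types mentioned in this thread; extend as needed.
QUANT_TYPES = ["q8_0", "q4_0", "iq4_xs", "iq3_s", "iq3_xxs"]


def build_command(model, quant, binary="sd"):
    """Return the argv for one conversion, mirroring the command from the report."""
    return [binary, "-M", "convert", "-m", model, "--type", quant]


def time_conversion(model, quant, binary="sd", runner=subprocess.run):
    """Run one conversion and return (elapsed_seconds, return_code).

    `runner` is injectable so the function can be exercised without the binary.
    """
    start = time.perf_counter()
    result = runner(build_command(model, quant, binary))
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode


def main(model="realDream_sdxl6.safetensors"):
    """Convert the model once per quant type and print a small timing table."""
    for quant in QUANT_TYPES:
        elapsed, code = time_conversion(model, quant)
        # On POSIX, a child killed by a signal (e.g. abort) has a negative code.
        status = "ok" if code == 0 else f"failed (exit code {code})"
        print(f"{quant:8s} {elapsed:8.1f}s  {status}")
```

Calling `main()` runs the conversions one after another. The command in the report passes no output path, so this sketch does not either. A run that hits the `fatal error` above should show up as `failed` rather than stopping the loop.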
Conclusions
iq3_xxs, iq3_s, maybe more
q8_0 is converted 4.65x faster than iq4_xs
q8_0 > q4_0
System: