add global SYCL compile flags #597

Merged
merged 1 commit into from
Feb 22, 2025

Conversation

airMeng
Contributor

@airMeng airMeng commented Feb 13, 2025

fix #542

@paoletto @HeyItsBATMAN @Uxito-Ada sorry for the late reply. I work on the GGML community purely as a volunteer, and there have been some internal changes in recent months. Please give it a try and reply here if it doesn't work.

@airMeng airMeng mentioned this pull request Feb 13, 2025
@Uxito-Ada

Compilation and runtime both work well on my 4x A770 platform with SD v1.5.

@HeyItsBATMAN

Works for my B580! Thank you very much.

@airMeng
Contributor Author

airMeng commented Feb 14, 2025

@leejet could you merge?

@aahouzi

aahouzi commented Feb 14, 2025

@HeyItsBATMAN It's still not working on my B580, can you share your config and OS ?

@HeyItsBATMAN

HeyItsBATMAN commented Feb 14, 2025

@HeyItsBATMAN It's still not working on my B580, can you share your config and OS ?
@aahouzi

OS: Arch Linux x86_64
Kernel: Linux 6.14.0-rc2
Oneapi: intel-oneapi-basekit-2025 2025.0.1.46-1

Environment:

ZES_ENABLE_SYSMAN 1
ONEAPI_DEVICE_SELECTOR level_zero:0
SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS 1
SYCL_CACHE_PERSISTENT 1

And I source the following 3 scripts:

source /opt/intel/oneapi/setvars.sh
source /opt/intel/oneapi/pti/0.10/env/vars.sh
source /opt/intel/oneapi/umf/0.9/env/vars.sh
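The environment above could be collected into a small shell snippet that is sourced before running `sd`. This is only a sketch: the variable values are the ones listed in this comment, and the `pti/0.10` and `umf/0.9` paths belong to that particular oneAPI install and will likely differ on other systems, so each script is sourced only if it exists.

```shell
# Environment variables exactly as listed in the comment above.
export ZES_ENABLE_SYSMAN=1
export ONEAPI_DEVICE_SELECTOR=level_zero:0
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1

# oneAPI setup scripts from the comment above. The versioned paths are
# install-specific (assumption: adjust them to the local layout), so a
# missing script is skipped instead of aborting.
for script in \
    /opt/intel/oneapi/setvars.sh \
    /opt/intel/oneapi/pti/0.10/env/vars.sh \
    /opt/intel/oneapi/umf/0.9/env/vars.sh
do
    if [ -f "$script" ]; then
        . "$script"
    fi
done
```

Sourcing the snippet (rather than executing it) keeps the variables in the current shell, which is what the later `cmake` and `./bin/sd` invocations need.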

@paoletto

Thank you. I confirm I can get it to compile and run on Xe arch (Alchemist, I believe), integrated on a Gen11 i5.
However, it crashes with:

ggml_backend_sycl_buffer_type_alloc_buffer: can't malloc 8774997120 Bytes memory on deviceggml_gallocr_reserve_n: failed to allocate SYCL0 buffer of size 8774997120

What I did:

((021bded...))$ source /opt/intel/oneapi/setvars.sh
$ cmake .. -DSD_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
$ cmake --build . --config Release
$ flushCache.sh
$ free
               total        used        free      shared  buff/cache   available
Mem:        32515656     6369200    23211288     2132720     2935168    23601696
Swap:              0           0           0
$ ./bin/sd -m ~/AI/models/StableDiffusion/CP/realisticVisionV60B1_v51HyperVAE.safetensors  --cfg-scale 5 --steps 30 --sampling-method euler  -H 1024 -W 1024 --seed 42 -p "fantasy medieval village world inside a glass sphere , high detail, fantasy, realistic, light effect, hyper detail, volumetric lighting, cinematic, macro, depth of field, blur, red light and clouds from the back, highly detailed epic cinematic concept art cg render made in maya, blender and photoshop, octane render, excellent composition, dynamic dramatic cinematic lighting, aesthetic, very inspirational, world inside a glass sphere by james gurney by artgerm with james jean, joe fenton and tristan eaton by ross tran, fine details, 4k resolution"

...
[INFO ] stable-diffusion.cpp:1434 - generating image: 1/1 - seed 42
ggml_backend_sycl_buffer_type_alloc_buffer: can't malloc 8774997120 Bytes memory on deviceggml_gallocr_reserve_n: failed to allocate SYCL0 buffer of size 8774997120
[ERROR] ggml_extend.hpp:1052 - unet: failed to allocate the compute buffer


@aahouzi

aahouzi commented Feb 14, 2025

@HeyItsBATMAN I see. On Windows it's still not running.

@Uxito-Ada

Hi @paoletto, I encountered the same error, which means the device ran out of memory (OOM). You can try generating a smaller image, e.g. -H 512 -W 512.
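As a rough check of the numbers in the log, the failed request and the memory reported by `free` can be compared with shell arithmetic. This sketch only restates figures from the comments in this thread; the variable names are illustrative, and it makes no claim about how the compute buffer scales with resolution beyond the raw pixel count.

```shell
# Size of the compute buffer that failed to allocate (from the error log).
requested_bytes=8774997120
# "available" column of the `free` output quoted in the thread, in KiB.
available_kib=23601696

# Convert both figures to MiB (integer division, so values are rounded down).
requested_mib=$(( requested_bytes / 1024 / 1024 ))
available_mib=$(( available_kib / 1024 ))
echo "requested: ${requested_mib} MiB, host available: ${available_mib} MiB"

# Pixel counts of the original and the suggested image size.
pixels_large=$(( 1024 * 1024 ))
pixels_small=$(( 512 * 512 ))
pixel_ratio=$(( pixels_large / pixels_small ))
echo "1024x1024 has ${pixel_ratio}x the pixels of 512x512"
```

The request is a single buffer of roughly 8.2 GiB, which the SYCL device refused even though the host reported more memory available. The suggested retry keeps the original command and changes only `-H 1024 -W 1024` to `-H 512 -W 512`.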

@paoletto

@Uxito-Ada thanks mate! That worked indeed! So I confirm this fix works for me as well!

@airMeng
Contributor Author

airMeng commented Feb 18, 2025

@leejet Hi, could you merge this PR since it has been verified in the above comments?

@paoletto

paoletto commented Feb 22, 2025

This project looks abandoned. I already wrote this some weeks ago. Maybe it needs a fork?

@stduhpf
Contributor

stduhpf commented Feb 22, 2025

This project looks abandoned. I already wrote this some weeks ago. Maybe it needs a fork?

I don't believe it's abandoned; it's just a bit slow to update. If you take a look at the commit history, you can see that leejet comes back every month or so to merge some PRs and then disappears again.

(I do have a fork with some more PRs implemented, but I don't think it should replace the original: https://github.com/stduhpf/stable-diffusion.cpp/tree/bleedingedge)

@leejet
Owner

leejet commented Feb 22, 2025

Thank you for your contribution

@leejet leejet merged commit 838beb9 into leejet:master Feb 22, 2025
Development

Successfully merging this pull request may close these issues.

SyCL intel build broken?
7 participants