@@ -21,7 +21,7 @@ Inference of Stable Diffusion and Flux in pure C/C++
- Accelerated memory-efficient CPU inference
- Only requires ~2.3GB when using txt2img with fp16 precision to generate a 512x512 image; enabling Flash Attention brings this down to ~1.8GB (see the example below).
- AVX, AVX2 and AVX512 support for x86 architectures
- - Full CUDA, Metal, Vulkan and SYCL backend for GPU acceleration.
+ - Full CUDA, Metal, Vulkan, OpenCL and SYCL backend for GPU acceleration.
- Can load ckpt, safetensors and diffusers models/checkpoints. Standalone VAE models
  - No need to convert to `.ggml` or `.gguf` anymore!
- Flash Attention for memory usage optimization
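
As a concrete illustration of the txt2img memory point above, here is a minimal CPU-only sketch. It assumes the usual `sd` CLI flags (`-m` for the model, `-p` for the prompt, `-o` for the output file) and the default 512x512 resolution; paths and the prompt are placeholders.

```bash
# Minimal txt2img sketch on CPU (placeholder model path and prompt).
# A 512x512 fp16 generation like this is the workload the ~2.3GB
# figure above refers to.
./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat" -o output.png
```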
@@ -158,7 +158,80 @@ Install Vulkan SDK from https://www.lunarg.com/vulkan-sdk/.
cmake .. -DSD_VULKAN=ON
cmake --build . --config Release
```
+
+ ##### Using OpenCL (for Adreno GPU)
+
+ Currently, the OpenCL backend supports only Adreno GPUs and is primarily optimized for the Q4_0 quantization type.
+
+ To build for Windows on ARM, please refer to [Windows 11 Arm64](https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/OPENCL.md#windows-11-arm64).

+ **Building for Android:**
+
+ **Android NDK:**
+ Download and install the Android NDK from the [official Android developer site](https://developer.android.com/ndk/downloads).
+
+ **Setup OpenCL Dependencies for NDK:**
+
+ You need to provide OpenCL headers and the ICD loader library to your NDK sysroot.
+
+ * **OpenCL Headers:**
+ ```bash
+ # In a temporary working directory
+ git clone https://github.com/KhronosGroup/OpenCL-Headers
+ cd OpenCL-Headers
+ # Replace <YOUR_NDK_PATH> with your actual NDK installation path
+ # e.g., cp -r CL /path/to/android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include
+ sudo cp -r CL <YOUR_NDK_PATH>/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include
+ cd ..
+ ```
+
+ * **OpenCL ICD Loader:**
+ ```bash
+ # In the same temporary working directory
+ git clone https://github.com/KhronosGroup/OpenCL-ICD-Loader
+ cd OpenCL-ICD-Loader
+ mkdir build_ndk && cd build_ndk
+
+ # Replace <YOUR_NDK_PATH> in the CMAKE_TOOLCHAIN_FILE and OPENCL_ICD_LOADER_HEADERS_DIR
+ cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release \
+   -DCMAKE_TOOLCHAIN_FILE=<YOUR_NDK_PATH>/build/cmake/android.toolchain.cmake \
+   -DOPENCL_ICD_LOADER_HEADERS_DIR=<YOUR_NDK_PATH>/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include \
+   -DANDROID_ABI=arm64-v8a \
+   -DANDROID_PLATFORM=24 \
+   -DANDROID_STL=c++_shared
+
+ ninja
+ # Replace <YOUR_NDK_PATH>
+ # e.g., cp libOpenCL.so /path/to/android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android
+ sudo cp libOpenCL.so <YOUR_NDK_PATH>/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android
+ cd ../..
+ ```
+
+ **Build `stable-diffusion.cpp` for Android with OpenCL:**
+
+ ```bash
+ mkdir build-android && cd build-android
+
+ # Replace <YOUR_NDK_PATH> with your actual NDK installation path
+ # e.g., -DCMAKE_TOOLCHAIN_FILE=/path/to/android-ndk-r26c/build/cmake/android.toolchain.cmake
+ cmake .. -G Ninja \
+   -DCMAKE_TOOLCHAIN_FILE=<YOUR_NDK_PATH>/build/cmake/android.toolchain.cmake \
+   -DANDROID_ABI=arm64-v8a \
+   -DANDROID_PLATFORM=android-28 \
+   -DGGML_OPENMP=OFF \
+   -DSD_OPENCL=ON
+
+ ninja
+ ```
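
To try the result on a device, one option is `adb`. This is only a sketch: it assumes the build drops the executable at `build-android/bin/sd`, that the flag names match the desktop CLI (`-m`, `-p`, `-o`), and that the phone's vendor OpenCL driver lives under `/vendor/lib64` (typical for Adreno devices); the model file is a placeholder.

```bash
# Push the binary and a placeholder model to the device.
adb push bin/sd /data/local/tmp/sd
adb push /path/to/model-q4_0.gguf /data/local/tmp/model-q4_0.gguf
adb shell chmod +x /data/local/tmp/sd

# Run txt2img on the device. Per the note above, Q4_0 weights are the
# best-optimized case for this backend. LD_LIBRARY_PATH points at the
# vendor OpenCL driver in case it is not picked up automatically.
adb shell "cd /data/local/tmp && LD_LIBRARY_PATH=/vendor/lib64 ./sd -m model-q4_0.gguf -p 'a lovely cat' -o output.png"
```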
##### Using SYCL

Using SYCL makes the computation run on the Intel GPU. Please make sure you have installed the related driver and [Intel® oneAPI Base toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html) before you start. For more details and steps, refer to [llama.cpp SYCL backend](https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md#linux).
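
The SYCL-specific CMake invocation is not shown in this hunk; as a rough sketch (assuming a `-DSD_SYCL=ON` option and the oneAPI `icx`/`icpx` compilers, following the llama.cpp SYCL setup linked above), the build would look something like:

```bash
# Load the oneAPI environment (compilers and SYCL runtime).
source /opt/intel/oneapi/setvars.sh

# Configure with the SYCL backend and Intel compilers, then build.
# Option and compiler names are assumptions; check CMakeLists.txt.
cmake .. -DSD_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build . --config Release
```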