| 1 | +<p align="center"> |
| 2 | + <img src="./assets/a%20lovely%20cat.png" width="256x"> |
| 3 | +</p> |
| 4 | + |
| 5 | +# stable-diffusion.cpp |
| 6 | + |
| 7 | +Inference of [Stable Diffusion](https://github.com/CompVis/stable-diffusion) in pure C/C++ |
| 8 | + |
| 9 | +## Features |
| 10 | + |
| 11 | +- Plain C/C++ implementation based on [ggml](https://github.com/ggerganov/ggml), working in the same way as [llama.cpp](https://github.com/ggerganov/llama.cpp) |
| 12 | +- 16-bit, 32-bit float support |
| 13 | +- 4-bit, 5-bit and 8-bit integer quantization support |
| 14 | +- Accelerated memory-efficient CPU inference |
| 15 | +- AVX, AVX2 and AVX512 support for x86 architectures |
| 16 | +- Original `txt2img` mode |
| 17 | +- Negative prompt |
| 18 | +- Sampling method |
| 19 | + - `Euler A` |
| 20 | +- Supported platforms |
| 21 | + - Linux |
| 22 | + - macOS |
| 23 | + - Windows |
| 24 | + |
| 25 | +### TODO |
| 26 | + |
| 27 | +- [ ] Original `img2img` mode |
| 28 | +- [ ] More sampling methods |
| 29 | +- [ ] GPU support |
| 30 | +- [ ] Make inference faster |
| 31 | + - The current implementation of `ggml_conv_2d` is slow and has high memory usage |
| 32 | +- [ ] Continue reducing memory usage (quantize the weights of `ggml_conv_2d`) |
| 33 | +- [ ] [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) style tokenizer |
| 34 | +- [ ] LoRA support |
| 35 | + |
| 36 | +## Usage |
| 37 | + |
| 38 | +### Get the Code |
| 39 | + |
| 40 | +```shell |
| 41 | +git clone --recursive https://github.com/leejet/stable-diffusion.cpp |
| 42 | +cd stable-diffusion.cpp |
| 43 | +``` |
| 44 | + |
| 45 | +### Convert weights |
| 46 | + |
| 47 | +- Download the original weights (.ckpt or .safetensors), for example: |
| 48 | + - Stable Diffusion v1.4 from https://huggingface.co/CompVis/stable-diffusion-v-1-4-original |
| 49 | + - Stable Diffusion v1.5 from https://huggingface.co/runwayml/stable-diffusion-v1-5 |
| 50 | + |
| 51 | + ```shell |
| 52 | + curl -L -O https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt |
| 53 | + # curl -L -O https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors |
| 54 | + ``` |
| 55 | + |
| 56 | +- Convert the weights to the ggml model format: |
| 57 | + |
| 58 | + ```shell |
| 59 | + cd models |
| 60 | + pip install -r requirements.txt |
| 61 | + python convert.py [path to weights] --out_type [output precision] |
| 62 | + # For example, python convert.py sd-v1-4.ckpt --out_type f16 |
| 63 | + ``` |
| 64 | + |
| 65 | +### Quantization |
| 66 | + |
| 67 | +You can specify the output model format with the `--out_type` parameter: |
| 68 | + |
| 69 | +- `f16` for 16-bit floating-point |
| 70 | +- `f32` for 32-bit floating-point |
| 71 | +- `q8_0` for 8-bit integer quantization |
| 72 | +- `q5_0` or `q5_1` for 5-bit integer quantization |
| 73 | +- `q4_0` or `q4_1` for 4-bit integer quantization |
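
To compare several of the precisions listed above, one option is to loop over the `--out_type` values. The following is a minimal sketch, not part of the project: `convert_all`, `WEIGHTS`, and `RUN` are hypothetical names, and it assumes you run it from the `models` directory where `convert.py` lives. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print (or run) one convert.py call per precision.
# WEIGHTS and RUN are illustrative variables, not options of convert.py.
WEIGHTS="${WEIGHTS:-sd-v1-4.ckpt}"
RUN="${RUN:-echo}"   # dry run by default; set RUN= (empty) to actually convert

convert_all() {
    for out_type in "$@"; do
        # $RUN is left unquoted on purpose so an empty value disappears.
        $RUN python convert.py "$WEIGHTS" --out_type "$out_type"
    done
}

# One full-precision baseline plus two quantized variants.
convert_all f16 q8_0 q4_0
```

Running it with `RUN=` set to an empty string performs the conversions instead of printing them.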
| 74 | + |
| 75 | +### Build |
| 76 | + |
| 77 | +```shell |
| 78 | +mkdir build |
| 79 | +cd build |
| 80 | +cmake .. |
| 81 | +cmake --build . --config Release |
| 82 | +``` |
| 83 | + |
| 84 | +#### Using OpenBLAS |
| 85 | + |
| 86 | +```shell |
| 87 | +cmake .. -DGGML_OPENBLAS=ON |
| 88 | +cmake --build . --config Release |
| 89 | +``` |
| 90 | + |
| 91 | +### Run |
| 92 | + |
| 93 | +``` |
| 94 | +usage: ./sd [arguments] |
| 95 | + |
| 96 | +arguments: |
| 97 | + -h, --help show this help message and exit |
| 98 | + -t, --threads N number of threads to use during computation (default: -1). |
| 99 | + If threads <= 0, then threads will be set to the number of CPU cores |
| 100 | + -m, --model [MODEL] path to model |
| 101 | + -o, --output OUTPUT path to write result image to (default: .\output.png) |
| 102 | + -p, --prompt [PROMPT] the prompt to render |
| 103 | + -n, --negative-prompt PROMPT the negative prompt (default: "") |
| 104 | + --cfg-scale SCALE unconditional guidance scale: (default: 7.0) |
| 105 | + -H, --height H image height, in pixel space (default: 512) |
| 106 | + -W, --width W image width, in pixel space (default: 512) |
| 107 | + --sample-method SAMPLE_METHOD sample method (default: "euler a") |
| 108 | + --steps STEPS number of sample steps (default: 20) |
| 109 | + -s SEED, --seed SEED RNG seed (default: 42, use random seed for < 0) |
| 110 | + -v, --verbose print extra info |
| 111 | +``` |
| 112 | + |
| 113 | +For example: |
| 114 | + |
| 115 | +```shell |
| 116 | +./sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat" |
| 117 | +``` |
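
A fuller invocation can combine several of the flags from the usage text. The sketch below uses only flags listed there; the negative prompt, step count, seed, and output file name are illustrative values, and `render_cat` and `SD` are hypothetical names. It prints the command by default so it can be inspected before running.

```shell
#!/bin/sh
# Dry run by default: SD expands to "echo ./sd". Set SD=./sd to generate an image.
SD="${SD:-echo ./sd}"

render_cat() {
    # All flags come from the usage text; the values are examples only.
    $SD -m ../models/sd-v1-4-ggml-model-f16.bin \
        -p "a lovely cat" \
        -n "blurry" \
        -W 512 -H 512 \
        --steps 30 --cfg-scale 7.0 \
        -s 7 \
        -o cat.png
}

render_cat
```

Fixing the seed with `-s` keeps runs comparable when you change only one other setting, such as the model precision.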
| 118 | + |
| 119 | +Using formats of different precisions will yield results of varying quality. |
| 120 | + |
| 121 | +| f32 | f16 |q8_0 |q5_0 |q5_1 |q4_0 |q4_1 | |
| 122 | +| ---- |---- |---- |---- |---- |---- |---- | |
| 123 | +|  | | | | | | | |
| 124 | + |
| 125 | +## Memory/Disk Requirements |
| 126 | + |
| 127 | +| precision | f32 | f16 |q8_0 |q5_0 |q5_1 |q4_0 |q4_1 | |
| 128 | +| ---- | ---- |---- |---- |---- |---- |---- |---- | |
| 129 | +| **Disk** | 2.8G | 2.0G | 1.7G | 1.6G | 1.6G | 1.5G | 1.5G | |
| 130 | +| **Memory**(txt2img - 512 x 512) | ~4.9G | ~4.1G | ~3.8G | ~3.7G | ~3.7G | ~3.6G | ~3.6G | |
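
One way to read the table is to subtract the disk size from the memory figure for each precision. The sketch below does that arithmetic with `awk` on the numbers copied from the table; `overhead_table` is a hypothetical helper name. Since the memory figures are approximate, the differences are too. They come out at roughly 2.1G in every column, which suggests that quantization mainly shrinks the weights rather than the working memory.

```shell
#!/bin/sh
# Columns: precision, disk size (G), memory for txt2img at 512 x 512 (G).
# The figures are copied from the table above; the memory values are approximate.
overhead_table() {
    awk '{ printf "%-5s overhead=%.1fG\n", $1, $3 - $2 }' <<'EOF'
f32  2.8 4.9
f16  2.0 4.1
q8_0 1.7 3.8
q5_0 1.6 3.7
q5_1 1.6 3.7
q4_0 1.5 3.6
q4_1 1.5 3.6
EOF
}

overhead_table
```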
| 131 | + |
| 132 | + |
| 133 | +## References |
| 134 | + |
| 135 | +- [ggml](https://github.com/ggerganov/ggml) |
| 136 | +- [stable-diffusion](https://github.com/CompVis/stable-diffusion) |
| 137 | +- [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) |
| 138 | +- [k-diffusion](https://github.com/crowsonkb/k-diffusion) |