Releases · struct/llama.cpp

12 Jul 23:28

c31e606

b5884 Latest


tests : cover lfm2 cases in test_ssm_conv (#14651)

Assets 15

cudart-llama-bin-win-cuda-12.4-x64.zip
sha256:8c79a9b226de4b3cacfd1f83d24f962d0773be79f1e7b75c6af4ded7e32ae1d6
373 MB 2025-07-12T23:28:13Z

llama-b5884-bin-macos-arm64.zip
sha256:6df1dd3cdbb53122b12ac1adcc5aa81507d1cc63325d1c15bf0710d2a0a064c3
10.6 MB 2025-07-12T23:28:24Z

llama-b5884-bin-macos-x64.zip
sha256:aacda57e45c936a13ae304ed285feb534537fbb5915667516af2f744d862aca6
26.4 MB 2025-07-12T23:28:24Z

llama-b5884-bin-ubuntu-vulkan-x64.zip
sha256:9df90aa80b17c2909fa472a22d3d7b74279c8ce3ce4357e8d1865399006e8977
20.8 MB 2025-07-12T23:28:26Z

llama-b5884-bin-ubuntu-x64.zip
sha256:ee9f6336c35adbad345c89742df405ac5985257efe79e4bef7918b56c4be51db
12.4 MB 2025-07-12T23:28:27Z

llama-b5884-bin-win-cpu-arm64.zip
sha256:fcb28a3831c2698fa0287e6f253313e57ff43ff00712f5a744eb7cdc3b384340
10.8 MB 2025-07-12T23:28:28Z

llama-b5884-bin-win-cpu-x64.zip
sha256:21ff4504520b79d0c1bf855efe26f2a51948f51d1c0bd4c8c7954b0b6ccb0689
13.6 MB 2025-07-12T23:28:28Z

llama-b5884-bin-win-cuda-12.4-x64.zip
sha256:f93f9b4ec6799d4411788ce64f269b5f8c7fcb6a147e0ec44bc5562b04708c52
129 MB 2025-07-12T23:28:29Z

llama-b5884-bin-win-hip-radeon-x64.zip
sha256:bbd4c82830ff7e4313c6c39a24f94c4da8b5a70b360ccf2415584932d6a68429
298 MB 2025-07-12T23:28:34Z

llama-b5884-bin-win-opencl-adreno-arm64.zip
sha256:f68d317cf423a2d7cceed59afbb92b2abf2e0d4a187f336c63bac73717002ffb
11.2 MB 2025-07-12T23:28:43Z

Source code (zip)
2025-07-12T17:10:14Z

Source code (tar.gz)
2025-07-12T17:10:14Z
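
The sha256 values listed with each asset can be used to check a download before unpacking it. The following is a minimal sketch in Python using only the standard library; the helper names `sha256_of_file` and `matches_listed_digest` are hypothetical and not part of llama.cpp, and the sketch assumes the asset has already been saved to a local path.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are never loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_digest(path, listed):
    """Check a file against a digest given as 'sha256:<hex>' or bare hex."""
    expected = listed.strip().lower()
    if expected.startswith("sha256:"):
        expected = expected[len("sha256:"):]
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, with `llama-b5884-bin-ubuntu-x64.zip` in the current directory, `matches_listed_digest("llama-b5884-bin-ubuntu-x64.zip", "sha256:ee9f6336c35adbad345c89742df405ac5985257efe79e4bef7918b56c4be51db")` should return `True` for an intact download.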

08 Jul 23:14

github-actions


6efcd65

b5849

vulkan: optimize flash attention split_k_reduce (#14554)

* vulkan: allow FA split_k with smaller KV values

* vulkan: spread split_k_reduce work across more threads

k_num can get rather large. Use the whole workgroup to reduce the M/L values.

Launch a thread for each element in the HSV dimension of the output. Helps a
lot for large HSV (like deepseek).
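
The reduction step described above can be illustrated with a small CPU sketch. This is not the Vulkan shader from #14554; it assumes the usual flash attention bookkeeping, where each of the `k_num` KV slices produces a running maximum `m`, a sum of exponentials `l` (taken here to be the "M/L values"), and an unnormalized partial output `o` with one entry per element of the value head dimension (taken here to be "HSV"). The function names and the data layout are hypothetical.

```python
import math


def attention_partial(scores, values):
    """Compute (m, l, o) for one KV slice of a single query row."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    l = sum(weights)
    dim = len(values[0])
    # Unnormalized output: weighted sum of value rows, per output element.
    o = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return m, l, o


def merge_split_k(partials):
    """Merge per-slice (m, l, o) partials into the final attention output."""
    m_global = max(m for m, _, _ in partials)
    l_global = 0.0
    out = [0.0] * len(partials[0][2])
    for m, l, o in partials:
        scale = math.exp(m - m_global)  # rescale the slice to the shared max
        l_global += l * scale
        # Each output element is reduced independently of the others,
        # which is what allows one thread per element on a GPU.
        for d, value in enumerate(o):
            out[d] += value * scale
    return [x / l_global for x in out]
```

Merging the partials of any split of the KV range gives the same result as computing softmax attention over the whole range at once, up to floating-point rounding. Because the inner loop over `d` carries no dependency between elements, a wide value head offers a lot of parallel work.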

Assets 15
