sync : ggml #3264

ggerganov · 2025-06-18T07:23:20Z

No description provided.

This commit adds the examples to the list of targets for which MSVC warnings are ignored. The motivation is that the examples currently generate a number of warnings that are ignored/disabled for the core ggml project. This makes for cleaner output when building.

This commit removes the unused `ggml_context_container` structure from the ggml library. It looks like the usage of this struct was removed in commit 4757fe18d56ec11bf9c07feaca6e9d5b5357e7f4 ("ggml : alloc ggml_contexts on the heap (whisper/2525)"). The motivation for this change is to improve code clarity/readability.

* ggml : disable warnings for tests when using MSVC

  This commit disables warnings for tests on Windows when using MSVC. The motivation is that this brings the build output more in line with what Linux/macOS systems produce.

  There is still one warning generated for the tests:
  ```console
  Building Custom Rule C:/ggml/tests/CMakeLists.txt
  cl : command line warning D9025: overriding '/DNDEBUG' with '/UNDEBUG' [C:\ggml\build\tests\test-arange.vcxproj]
  test-arange.cpp
  test-arange.vcxproj -> C:\ggml\build\bin\Release\test-arange.exe
  ```

* ggml : fix typo in tests disable list


Use the same descriptor set layout for all pipelines (MAX_PARAMETER_COUNT == 8) and move it to the vk_device. Move all the descriptor pool and set tracking to the context - none of it is specific to pipelines anymore. It has a single vector of pools and vector of sets, and a single counter to track requests and a single counter to track use.
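The bookkeeping described above can be illustrated with a small model. This is only a sketch under stated assumptions: `mock_context`, `SETS_PER_POOL`, and the helper functions are hypothetical names, and plain integers stand in for Vulkan descriptor pools and sets. It is not the ggml-vulkan implementation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical pool capacity; the real value is not stated in the commit message.
constexpr uint32_t SETS_PER_POOL = 4;

// Stand-in for a per-context tracker: one vector of pools, one vector of sets,
// one counter for requests and one counter for use.
struct mock_context {
    std::vector<uint32_t> pools;   // each entry stands for one descriptor pool
    std::vector<uint32_t> sets;    // each entry stands for one descriptor set
    uint32_t requested = 0;        // sets requested so far
    uint32_t used      = 0;        // sets handed out so far
};

// Record that a pipeline will need n descriptor sets.
void request_sets(mock_context & ctx, uint32_t n) {
    ctx.requested += n;
}

// Grow the pool and set vectors until every requested set exists.
void allocate_sets(mock_context & ctx) {
    while (ctx.sets.size() < ctx.requested) {
        if (ctx.sets.size() % SETS_PER_POOL == 0) {
            // The current pools are full, so add another one.
            ctx.pools.push_back((uint32_t) ctx.pools.size());
        }
        ctx.sets.push_back((uint32_t) ctx.sets.size());
    }
}

// Hand out the next unused set; throws std::out_of_range if none is left.
uint32_t next_set(mock_context & ctx) {
    return ctx.sets.at(ctx.used++);
}

// Start over: counters are cleared, allocated sets are kept for reuse.
void reset(mock_context & ctx) {
    ctx.requested = 0;
    ctx.used      = 0;
}
```

Because every pipeline shares one layout in this model, any allocated set can serve any pipeline, which is why a single pair of counters is enough.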

This change moves the command pool/buffer tracking into a vk_command_pool structure. There are two instances per context (for compute+transfer) and two instances per device for operations that don't go through a context. This should prevent separate contexts from stomping on each other.

* ggml-cpu: Factor out feature detection build from x86

* ggml-cpu: Add ARM feature detection and scoring

  This is analogous to cpu-feats-x86.cpp. However, to detect compile-time activation of features, we rely on GGML_USE_<FEAT>, which needs to be set in cmake, instead of GGML_<FEAT>, which users would set for x86. This is because on ARM, users specify features with GGML_CPU_ARM_ARCH rather than with individual flags.

* ggml-cpu: Implement GGML_CPU_ALL_VARIANTS for ARM

  Like x86; however, to pass around arch flags within cmake, we use GGML_INTERNAL_<FEAT> as we don't have GGML_<FEAT>. Some features are optional, so we may need to build multiple backends per arch version (armv8.2_1, armv8.2_2, ...) and let the scoring function sort out which one can be used.

* ggml-cpu: Limit ARM GGML_CPU_ALL_VARIANTS to Linux for now

  The other platforms will need their own specific variants. This also fixes the bug that the variant-building branch was always being executed as the else-branch of GGML_NATIVE=OFF. The branch is moved to an elseif-branch, which restores the previous behavior.
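To illustrate how a scoring function can choose between several variants built for the same arch version, here is a minimal sketch. The feature bits, `backend_variant`, `score_variant`, and `pick_variant` are hypothetical names chosen for this example, and the one-point-per-feature scoring rule is an assumption, not the logic of the actual ggml-cpu scoring code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature bits; the real detection code is not shown in this PR.
enum cpu_feature : uint32_t {
    FEAT_DOTPROD     = 1u << 0,
    FEAT_FP16        = 1u << 1,
    FEAT_SVE         = 1u << 2,
    FEAT_MATMUL_INT8 = 1u << 3,
};

struct backend_variant {
    std::string name;     // e.g. "armv8.2_1" (naming taken from the commit message)
    uint32_t    required; // features this variant was compiled with
};

// A score of 0 means "cannot run on this host"; otherwise a variant that uses
// more features scores higher (assumed rule: one point per required feature).
int score_variant(const backend_variant & v, uint32_t host_features) {
    if ((v.required & ~host_features) != 0) {
        return 0; // the variant needs a feature the host lacks
    }
    int score = 1; // baseline so a variant with no requirements is still usable
    for (uint32_t bits = v.required; bits != 0; bits &= bits - 1) {
        score += 1; // each loop iteration clears one set bit
    }
    return score;
}

// Returns the index of the best usable variant, or -1 if none can run.
int pick_variant(const std::vector<backend_variant> & variants, uint32_t host_features) {
    int best       = -1;
    int best_score = 0;
    for (size_t i = 0; i < variants.size(); ++i) {
        const int s = score_variant(variants[i], host_features);
        if (s > best_score) {
            best_score = s;
            best       = (int) i;
        }
    }
    return best;
}
```

With this rule, a host that lacks an optional feature simply falls back to the next variant of the same arch version that does not require it.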


Update oneMath commit to merged PR uxlfoundation/oneMath#669, which adds SYCL-Graph support for recording CUDA BLAS commands.

With this change the `MUL_MAT` tests now pass on DPC++ CUDA backends with SYCL-Graph enabled. Prior to this change, an error would be thrown.

```
$ GGML_SYCL_DISABLE_GRAPH=0 ./bin/test-backend-ops -b SYCL0 -o MUL_MAT -p type_a=f16,type_b=f32,m=16,n=1,k=256,bs=\\[1,1\\],nr=\\[2

UR CUDA ERROR:
        Value: 700
        Name: CUDA_ERROR_ILLEGAL_ADDRESS
        Description: an illegal memory access was encountered
        Function: operator()
        Source Location: $HOME/dpcpp/unified-runtime/source/adapters/cuda/queue.cpp:154

Native API failed. Native API returns: 2147483646 (UR_RESULT_ERROR_UNKNOWN)
Exception caught at file:$HOME/llama.cpp/ggml/src/ggml-sycl/ggml-sycl.cpp, line:3598, func:operator()
SYCL error: CHECK_TRY_ERROR((stream)->wait()): Meet error in this line code!
  in function ggml_backend_sycl_synchronize at $HOME/llama.cpp/ggml/src/ggml-sycl/ggml-sycl.cpp:3598
$HOME/llama.cpp/ggml/src/ggml-sycl/../ggml-sycl/common.hpp:118: SYCL error
Could not attach to process. If your uid matches the uid of the target process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
```


danbev and others added 26 commits June 18, 2025 10:21

rpc : nicer error messages for RPC server crash (llama/14076)

9b850cf

Vulkan: Don't default to CPU device (like llvmpipe), even if no other device is available, to allow fallback to CPU backend (llama/14099)

19f70d6

opencl: add mul_mv_id_q4_0_f32_8x_flat (llama/14003)

ea295ba

cmake : handle whitespaces in path during metal build (llama/14126)

7bc2b17

* cmake : handle whitespaces in path during metal build

  ggml-ci

* cont : proper fix

  ggml-ci

---------

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>

sycl: Remove not needed copy f16->f32 for dnnl mul mat (llama/14125)

1e6693c

sycl: Adding additional cpy dbg print output (llama/14034)

28560f9

HIP: Replace usage of deprecated preprocessor macro __AMDGCN_WAVEFRONT_SIZE__ (llama/14183)

39f31e3

CUDA/HIP: fix ssm_scan on devices where warp size is not 32 (llama/14196)

6c721d6

ggml-cpu : rework weak alias on apple targets (llama/14146)

4640699

* ggml-cpu : rework weak alias on apple targets
* fix powerpc detection
* fix ppc detection
* fix powerpc detection on darwin

vulkan: mutex around vkQueueSubmit (llama/14127)

2c5f56b

This fixes the remaining crash in test-thread-safety on my system.
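The Vulkan specification requires the queue passed to `vkQueueSubmit` to be externally synchronized, so concurrent submissions from several threads need a lock. The pattern can be sketched without Vulkan itself. In the sketch below, `mock_queue`, `submit`, and `run_concurrent_submits` are hypothetical names, and a plain counter stands in for the real submission call; it is an illustration of the locking idea, not the code from this commit.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for a Vulkan queue shared by several threads.
struct mock_queue {
    std::mutex mtx;          // guards every submission to this queue
    int        submissions = 0; // deliberately non-atomic: only safe under the lock
};

// Serialize submissions; real code would call vkQueueSubmit while holding the lock.
void submit(mock_queue & q) {
    std::lock_guard<std::mutex> lock(q.mtx);
    ++q.submissions;
}

// Spawn n_threads workers that each submit n_per_thread times to one shared queue
// and return the total number of submissions recorded.
int run_concurrent_submits(int n_threads, int n_per_thread) {
    mock_queue q;
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&q, n_per_thread] {
            for (int i = 0; i < n_per_thread; ++i) {
                submit(q);
            }
        });
    }
    for (auto & w : workers) {
        w.join();
    }
    return q.submissions;
}
```

Without the `std::lock_guard`, the increments would race and the final count could come up short, which is the same class of problem a thread-safety test is meant to expose.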

ggml: Add Android support for GGML_CPU_ALL_VARIANTS (llama/14206)

c404919

HIP: disable rocwmma on gfx12 by default until rocm 7.0 (llama/14202)

acb45d4

cmake: clean up external project logic for vulkan-shaders-gen (llama/14179)

91f8856

* Remove install step for vulkan-shaders-gen
* Add install step to normalize msvc with make
* Regenerate modified shaders at build-time

llama : add thread safety test (llama/14035)

311fa20

* llama : add thread safety test
* llamafile : remove global state
* llama : better LLAMA_SPLIT_MODE_NONE logic

  when main_gpu < 0 GPU devices are not used

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

musa: fix build warning (unused variable) (llama/14231)

2400bdb

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

ggml-cpu : remove the weak alias trick (llama/14221)

9686aec

cmake: remove shader-gen step-targets from ggml-vulkan (llama/14226)

ad8b05b

* Remove step-targets from vulkan-shaders-gen
* Unset DESTDIR when building vulkan-shaders-gen

sync : ggml

a8d075d

ggml-ci

talk-llama : sync llama.cpp

9cc9701

ggml-ci

danbev approved these changes Jun 18, 2025


ggerganov merged commit 2f60ebc into master Jun 18, 2025
60 of 62 checks passed

ggerganov deleted the sync-ggml-25-06-18 branch June 18, 2025 09:40

ggerganov mentioned this pull request Jun 18, 2025

cmake : fix android build #3265

Merged
