cuda : fix rope with partial rotation and non-cont src #14580

ggerganov · 2025-07-08T06:01:25Z

The problem was revealed by #14573 and #14568

The problem occurs for rope ops where n_dim < ne00 (partial rotation) and the src tensor is non-contiguous.

We never really had proper tests for this case because of the all = false; logic in test-backend-ops.
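To illustrate the failure mode described above, here is a minimal CPU sketch, not the actual CUDA kernel. The function name `rope_partial_ref`, its parameters, and the single shared position are assumptions made for illustration. It applies adjacent-pair ("normal" mode) rotation to the first n_dims elements of each row and copies the remaining elements unchanged. The point is that the pass-through branch has to read the source through its own strided index (ix) and write through the contiguous destination index (idst); sharing one index for both only works when src is contiguous.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (illustration only, not the ggml implementation).
//   x          : source buffer; rows may be padded, so row stride >= ne0
//   nrows      : number of rows
//   ne0        : logical row length (even)
//   src_stride : distance between consecutive source rows, in elements
//   n_dims     : number of leading elements to rotate (even, <= ne0)
//   pos        : position applied to every row (kept constant for simplicity)
//   freq_base  : RoPE frequency base
std::vector<float> rope_partial_ref(const std::vector<float> & x,
                                    std::size_t nrows, std::size_t ne0,
                                    std::size_t src_stride, std::size_t n_dims,
                                    float pos, float freq_base) {
    assert(ne0 % 2 == 0 && n_dims % 2 == 0);
    assert(n_dims <= ne0 && src_stride >= ne0);
    assert(nrows == 0 || x.size() >= (nrows - 1) * src_stride + ne0);

    std::vector<float> dst(nrows * ne0); // the destination is contiguous

    for (std::size_t row = 0; row < nrows; ++row) {
        for (std::size_t i0 = 0; i0 < ne0; i0 += 2) {
            const std::size_t idst = row * ne0 + i0;        // contiguous dst index
            const std::size_t ix   = row * src_stride + i0; // strided src index

            if (i0 >= n_dims) {
                // Outside the rotated range: plain copy, but the read must use
                // ix. Reading x[idst] here is only correct for contiguous src.
                dst[idst + 0] = x[ix + 0];
                dst[idst + 1] = x[ix + 1];
                continue;
            }

            // Rotation angle for this pair: pos * freq_base^(-i0 / n_dims).
            const float theta = pos * std::pow(freq_base,
                -static_cast<float>(i0) / static_cast<float>(n_dims));
            const float c = std::cos(theta);
            const float s = std::sin(theta);

            const float x0 = x[ix + 0];
            const float x1 = x[ix + 1];

            dst[idst + 0] = x0 * c - x1 * s;
            dst[idst + 1] = x0 * s + x1 * c;
        }
    }
    return dst;
}
```

Calling it with two rows of logical length 4 stored at a stride of 6 and n_dims = 2 exercises both conditions at once: the second row's pass-through elements come out wrong if the copy reads the source through the destination index.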

ggml-ci

ggerganov · 2025-07-08T06:02:37Z

@qnixsynapse Could you push a SYCL fix directly in this branch, as I don't have an environment to test with?

ggml-ci

ggerganov · 2025-07-08T06:15:51Z

~~Qwen2VL models might have also been affected.~~ Probably not

ggml-ci

JohannesGaessler · 2025-07-08T06:34:44Z

ggml/src/ggml-cuda/rope.cu

+        dst[i + 0] = x[ix + 0];
+        dst[i + 1] = x[ix + 1];


Suggested change

-        dst[i + 0] = x[ix + 0];
-        dst[i + 1] = x[ix + 1];
+        dst[idst + 0] = x[ix + 0];
+        dst[idst + 1] = x[ix + 1];

From what I can tell i and idst are the same.

ggml-ci

qnixsynapse · 2025-07-08T07:38:31Z

> @qnixsynapse Could you push a SYCL fix directly in this branch, as I don't have an environment to test with?

Oh, you already seem to have fixed it... Thank you!!

CISC · 2025-07-08T10:37:47Z

@jeffbolznv @0cc4m Looks like Vulkan needs a fix too:
https://github.com/ggml-org/llama.cpp/actions/runs/16136548794/job/45533971724#step:6:25752

* origin/master:
  model : fix hunyuan moe chat template (ggml-org#14584)
  model : add SmolLM3 (ggml-org#14581)
  memory : fix broken batch splits for recurrent cache (ggml-org#14575)
  vulkan : fix rope with partial rotation and non-cont src (ggml-org#14582)
  server: Add ability to mount server at prefix (ggml-org#14544)
  model : add hunyuan moe (ggml-org#14425)
  vulkan: increase timeout for CI (ggml-org#14574)
  cuda : fix rope with partial rotation and non-cont src (ggml-org#14580)
  CUDA: add bilinear interpolation for upscale (ggml-org#14563)
  musa: fix build warnings (unused variable) (ggml-org#14561)
  llama : fix incorrect minicpm3 v_states shape (ggml-org#14571)
  llama : remove ggml_cont where possible (ggml-org#14568)

* cuda : fix rope non-cont
  ggml-ci
* cont : fix multi-rope + add test
  ggml-ci
* sycl : try fix
  ggml-ci
* cont : fix sycl + clean-up cuda
  ggml-ci

cuda : fix rope non-cont

2a9b730

ggml-ci

github-actions bot added testing Everything test related Nvidia GPU Issues specific to Nvidia GPUs ggml changes relating to the ggml tensor library for machine learning labels Jul 8, 2025

cont : fix multi-rope + add test

31af27a

ggml-ci

sycl : try fix

96998d7

ggml-ci

github-actions bot added the SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language label Jul 8, 2025

JohannesGaessler approved these changes Jul 8, 2025

cont : fix sycl + clean-up cuda

bcbf7bc

ggml-ci

ggerganov merged commit 4d0dcd4 into master Jul 8, 2025
55 of 56 checks passed

ggerganov deleted the gg/cuda-fix-rope-non-cont branch July 8, 2025 07:15

jeffbolznv mentioned this pull request Jul 8, 2025

vulkan: fix rope with partial rotation and non-cont src #14582

Merged
