cuda : fix rope with partial rotation and non-cont src #14580
ggml-ci
@qnixsynapse Could you push a SYCL fix directly in this branch, as I don't have an environment to test with?
ggml-ci
ggml/src/ggml-cuda/rope.cu
dst[i + 0] = x[ix + 0];
dst[i + 1] = x[ix + 1];
- dst[i + 0] = x[ix + 0];
- dst[i + 1] = x[ix + 1];
+ dst[idst + 0] = x[ix + 0];
+ dst[idst + 1] = x[ix + 1];
From what I can tell, i and idst are the same.
Oh, you already seem to have fixed it... Thank you!!
@jeffbolznv @0cc4m Looks like Vulkan needs a fix too:
* origin/master:
  model : fix hunyuan moe chat template (ggml-org#14584)
  model : add SmolLM3 (ggml-org#14581)
  memory : fix broken batch splits for recurrent cache (ggml-org#14575)
  vulkan : fix rope with partial rotation and non-cont src (ggml-org#14582)
  server: Add ability to mount server at prefix (ggml-org#14544)
  model : add hunyuan moe (ggml-org#14425)
  vulkan: increase timeout for CI (ggml-org#14574)
  cuda : fix rope with partial rotation and non-cont src (ggml-org#14580)
  CUDA: add bilinear interpolation for upscale (ggml-org#14563)
  musa: fix build warnings (unused variable) (ggml-org#14561)
  llama : fix incorrect minicpm3 v_states shape (ggml-org#14571)
  llama : remove ggml_cont where possible (ggml-org#14568)
* cuda : fix rope non-cont
  ggml-ci
* cont : fix multi-rope + add test
  ggml-ci
* sycl : try fix
  ggml-ci
* cont : fix sycl + clean-up cuda
  ggml-ci
The problem was revealed by #14573 and #14568.
The problem occurs for ropes with `n_dim < ne00` and a non-contiguous `src`. We never really had proper tests for this case because of the `all = false;` logic in `test-backend-ops`.
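To make the failure mode concrete, here is a minimal host-side sketch of the indexing involved. It is not the actual CUDA kernel: the function name `rope_partial_ref`, the 2D layout, the single position per row, and the parameter names are all assumptions made for illustration. It only shows why the pass-through elements (those at or beyond the rotated dimensions) need a separate destination index when the source rows are strided.

```cpp
#include <cmath>
#include <vector>

// Hypothetical reference for a "norm"-style rope with partial rotation.
// Assumptions (not taken from the real kernel):
//   - x has ne1 rows of ne0 floats, with a row stride of s1 elements
//     (s1 > ne0 means the source is non-contiguous, e.g. a view)
//   - dst is contiguous: ne0 floats per row
//   - one position per row, given in pos[row]
//   - only the first n_dims elements of each row are rotated
void rope_partial_ref(const float * x, float * dst,
                      int ne0, int ne1, int s1, int n_dims,
                      const int * pos, float freq_base) {
    for (int row = 0; row < ne1; ++row) {
        for (int i0 = 0; i0 < ne0; i0 += 2) {
            const int idst = row*ne0 + i0; // index into the contiguous dst
            const int ix   = row*s1  + i0; // index into the strided src

            if (i0 >= n_dims) {
                // Pass-through part: write with idst, read with ix.
                // Using one shared index here is only correct when s1 == ne0.
                dst[idst + 0] = x[ix + 0];
                dst[idst + 1] = x[ix + 1];
                continue;
            }

            // Rotate the pair (i0, i0 + 1) by the angle theta.
            const float theta = pos[row] * std::pow(freq_base, -float(i0)/n_dims);
            const float c = std::cos(theta);
            const float s = std::sin(theta);

            const float x0 = x[ix + 0];
            const float x1 = x[ix + 1];

            dst[idst + 0] = x0*c - x1*s;
            dst[idst + 1] = x0*s + x1*c;
        }
    }
}
```

With `ne0 = 4`, `n_dims = 2` and `s1 = 6`, the last two elements of every row take the pass-through branch, and the second row's source data starts at offset 6 while its destination starts at offset 4. A test that only covers contiguous sources (`s1 == ne0`) or full rotation (`n_dims == ne0`) never exercises that difference.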