kv-cache : opt mask set input #14600

Merged
merged 1 commit into from
Jul 17, 2025
ggerganov
Member

target #14363

Some micro-optimizations to make setting the input KQ mask faster.

./scripts/compare-commits.sh gg/llama-high-throughput gg/kv-cache-mask-opt -m ./models/qwen2.5-0.5b-coder/ggml-model-q8_0.gguf -d 0,16384 -r 5 -p 32 -fa 1 -t 1 -n 0
| Model | Test | t/s gg/llama-high-throughput | t/s gg/kv-cache-mask-opt | Speedup |
| --- | --- | ---: | ---: | ---: |
| qwen2 1B Q8_0 | pp32@d16384 | 735.49 | 756.51 | 1.03 |
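
For readers unfamiliar with the term, the sketch below illustrates what "setting the input KQ mask" involves: for every token in the batch, each KV cache cell is marked as visible (`0.0f`) or masked (`-INFINITY`). This is not the code from this PR. The `kv_cell` struct, the `set_kq_mask` function, and the visibility rule (same sequence, causal position) are simplified assumptions for illustration only. It shows the general kind of micro-optimization that applies to such a loop: pre-fill the buffer with the masked value and hoist per-token values and the row pointer out of the inner loop over cells.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the metadata of one KV cache cell.
struct kv_cell {
    int32_t pos;    // position of the cached token, -1 if the cell is empty
    int32_t seq_id; // sequence that owns the cell
};

// Fills a row-major [n_tokens x n_kv] mask:
//   0.0f      where batch token i may attend to cache cell j
//   -INFINITY otherwise
// Assumed visibility rule: same sequence and cell position <= token position.
void set_kq_mask(const std::vector<kv_cell> & cells,
                 const std::vector<int32_t> & tok_pos,
                 const std::vector<int32_t> & tok_seq,
                 std::vector<float>         & mask) {
    const size_t n_kv     = cells.size();
    const size_t n_tokens = tok_pos.size();

    // Pre-fill with the masked value so the inner loop only writes visible cells.
    mask.assign(n_tokens * n_kv, -INFINITY);

    for (size_t i = 0; i < n_tokens; ++i) {
        // Hoist per-token values and the row pointer out of the inner loop.
        const int32_t p   = tok_pos[i];
        const int32_t s   = tok_seq[i];
        float *       row = mask.data() + i * n_kv;

        for (size_t j = 0; j < n_kv; ++j) {
            const kv_cell & c = cells[j];
            if (c.pos >= 0 && c.seq_id == s && c.pos <= p) {
                row[j] = 0.0f;
            }
        }
    }
}
```

With a deep cache (the benchmark above uses a depth of 16384), the inner loop runs once per token per cell, so small per-iteration savings in a loop of this shape can add up during prompt processing.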

ggerganov force-pushed the gg/kv-cache-mask-opt branch from 5f168cc to e6dfccb on July 9, 2025 16:57
ggerganov force-pushed the gg/llama-high-throughput branch from 2aa6fa0 to f23950a on July 10, 2025 08:04
ggerganov force-pushed the gg/kv-cache-mask-opt branch from e6dfccb to 5974cc0 on July 10, 2025 15:24
ggerganov force-pushed the gg/llama-high-throughput branch from f23950a to ab82dc2 on July 11, 2025 08:27
ggerganov force-pushed the gg/kv-cache-mask-opt branch from 5974cc0 to 64d18c4 on July 11, 2025 08:27
ggerganov force-pushed the gg/llama-high-throughput branch from c43f275 to 886d3f1 on July 12, 2025 13:33
ggerganov force-pushed the gg/kv-cache-mask-opt branch from 64d18c4 to 6be0b6a on July 12, 2025 13:35
ggerganov mentioned this pull request on Jul 13, 2025
10 tasks
Base automatically changed from gg/llama-high-throughput to master on July 16, 2025 13:35
ggerganov force-pushed the gg/kv-cache-mask-opt branch from 6be0b6a to e5f31af on July 16, 2025 13:41
ggerganov merged commit d9b6910 into master on Jul 17, 2025
55 of 58 checks passed
ggerganov deleted the gg/kv-cache-mask-opt branch on July 17, 2025 06:49