Insights: ollama/ollama
Overview
1 Release published by 1 person
- v0.6.5, published Apr 6, 2025
15 Pull requests merged by 13 people
- types: include the 'items' and '$defs' fields to properly handle "array" types (#10091, merged Apr 10, 2025)
- Fix nondeterministic model unload order (#10185, merged Apr 9, 2025)
- Fix Dockerfile (#9855, merged Apr 9, 2025)
- fix(integration): move waitgroup Add(1) outside goroutine to avoid a potential race (#10070, merged Apr 8, 2025; see the first sketch after this list)
- kvcache: stub out test structs (#10120, merged Apr 8, 2025)
- types: add any type and validation for ToolFunction enum (#10166, merged Apr 8, 2025)
- cleanup: remove OLLAMA_TMPDIR and references to temporary executables (#10182, merged Apr 8, 2025)
- ollamarunner: preallocate worst-case graph at startup (#10171, merged Apr 8, 2025)
- Update README.md (#10173, merged Apr 8, 2025)
- Update README.md (#10156, merged Apr 7, 2025)
- Update README.md (#10168, merged Apr 7, 2025)
- types: allow tool function parameters with either a single type or an array of types (#9434, merged Apr 7, 2025)
- CONTRIBUTING: fix code block formatting (#10169, merged Apr 7, 2025)
- digest files in parallel (#10134, merged Apr 7, 2025; see the second sketch after this list)
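The waitgroup fix in #10070 follows a standard Go rule: `sync.WaitGroup.Add` must run before the goroutine it accounts for is launched, otherwise `Wait` can observe a zero counter and return early. A minimal sketch of the pattern (illustrative only, not the code from the PR):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	results := make([]int, 5)

	for i := 0; i < 5; i++ {
		// Add(1) must happen here, before the goroutine starts.
		// If it ran inside the goroutine, Wait() below could see a
		// zero counter and return before any worker had registered.
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = i * i // stand-in for real work
		}(i)
	}

	wg.Wait()
	fmt.Println(results) // [0 1 4 9 16]
}
```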
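Digesting files in parallel (#10134) is, in outline, a fan-out: one goroutine and one hash per file, joined with the same WaitGroup discipline as above. A minimal sketch; the command-line file list and error handling are assumptions for illustration, not the PR's actual implementation:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"sync"
)

// digestFile returns the hex-encoded SHA-256 digest of one file.
func digestFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	paths := os.Args[1:] // e.g. go run . layer1.bin layer2.bin
	digests := make([]string, len(paths))

	var wg sync.WaitGroup
	for i, p := range paths {
		wg.Add(1) // before the goroutine, as in #10070
		go func(i int, p string) {
			defer wg.Done()
			d, err := digestFile(p)
			if err != nil {
				d = "error: " + err.Error()
			}
			digests[i] = d // each goroutine writes only its own slot
		}(i, p)
	}
	wg.Wait()

	for i, p := range paths {
		fmt.Printf("%s  %s\n", digests[i], p)
	}
}
```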
23 Pull requests opened by 18 people
- Added ollama4j-ui (#10129, opened Apr 4, 2025)
- server: improve spacing for JSON grammar (#10131, opened Apr 4, 2025)
- create blobs in parallel (#10135, opened Apr 5, 2025)
- wip: llama4 multimodal (#10141, opened Apr 5, 2025)
- chore: add missing error check (#10144, opened Apr 6, 2025)
- CONTRIBUTING: fix rendering of the commit message title format in browsers (#10145, opened Apr 6, 2025)
- Fix OpenAI model retrieval for models with slashes (#10147, opened Apr 6, 2025)
- discover: make unique_id check optional for AMD GPU detection (#10150, opened Apr 6, 2025)
- create: check architecture rather than vision.block_count when importing GGUF (#10162, opened Apr 7, 2025)
- server: enhance api/tags with capability information (#10174, opened Apr 8, 2025)
- chore: add missing Close() call (#10179, opened Apr 8, 2025)
- feat(installer): add GPU support options (#10186, opened Apr 9, 2025)
- fix: ensure log file is properly closed after logging completes (#10187, opened Apr 9, 2025; see the sketch after this list)
- llama: update to commit 7538246e (#10192, opened Apr 9, 2025)
- scripts/install.sh: make curl progress bar optional (#10196, opened Apr 9, 2025)
- server: do not attempt to parse offset file as gguf (#10201, opened Apr 9, 2025)
- Update README.md (#10202, opened Apr 9, 2025)
- feat: capitalise ollama in ollama help description (#10203, opened Apr 9, 2025)
- Update README.md (#10220, opened Apr 10, 2025)
- ggml: Log filesystem errors (#10221, opened Apr 10, 2025)
- clarify quantization behavior in docs (#10224, opened Apr 10, 2025)
- ggml: fix crash when head counts are arrays (#10225, opened Apr 10, 2025)
- ggml: Fix memory leak on input tensors (#10226, opened Apr 10, 2025)
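Two of these PRs (#10179, #10187) address the same Go hygiene point: every opened file handle needs a matching Close, typically deferred, or a long-running process leaks descriptors and can lose buffered writes. A minimal sketch of the pattern; the file name and logger setup are assumptions for illustration, not taken from either PR:

```go
package main

import (
	"log"
	"os"
)

func main() {
	f, err := os.OpenFile("app.log", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		log.Fatal(err)
	}
	// Without this deferred Close, the descriptor stays open for the
	// lifetime of the process and buffered data may never be flushed.
	defer f.Close()

	logger := log.New(f, "ollama: ", log.LstdFlags)
	logger.Println("logging completed")
}
```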
41 Issues closed by 21 people
- Quantization Uses System Drive (#10223, closed Apr 10, 2025)
- hf pull is broken (#10195, closed Apr 10, 2025)
- Ollama 'API' at localhost:11434 returns a string of numbers, not a response (#10191, closed Apr 10, 2025)
- Fails to build on macOS with "fatal error: {'string','cstdint'} file not found" (#7392, closed Apr 9, 2025)
- How to force Ollama to use different CPU runners / how to compile the Windows AVX512 runner? (#6312, closed Apr 9, 2025)
- CUSTOM_CPU_FLAGS="" / non-AVX2 build (#8058, closed Apr 9, 2025)
- go run . server error (#9012, closed Apr 9, 2025)
- mistral-small v3.1 (#9827, closed Apr 9, 2025)
- Mistral-small3.1 crashes on prompt (#10175, closed Apr 9, 2025)
- llama vs ollama (#10190, closed Apr 9, 2025)
- mistral-small3.1 using too much VRAM (#10177, closed Apr 8, 2025)
- Tool call: Ollama enforces usage of string in enums for JSON Schema (#10164, closed Apr 8, 2025)
- Out of memory errors when running `gemma3` (#9791, closed Apr 8, 2025)
- Very strange RAM behavior with v0.6.4 (memory leak?) (#10132, closed Apr 8, 2025)
- Exceeding GPU memory even though I have 2 GPUs (#10089, closed Apr 8, 2025)
- Quantized Mistral Small 3.1 doesn't utilize NVIDIA GPUs (#10167, closed Apr 8, 2025)
- NewLlamaServer failed: model requires more system memory for gemma3:12b (#10181, closed Apr 8, 2025)
- Switched to `nomic-embed-text` model but still get `8192` dimension (#10176, closed Apr 8, 2025)
- options (temp, etc.) print values list in Python (#9284, closed Apr 8, 2025)
- `OLLAMA_CONTEXT_LENGTH=4096` but `OllamaEmbeddings` still shows `8192` (#10149, closed Apr 8, 2025)
- Tools and properties.type not supporting arrays (#5990, closed Apr 7, 2025)
- 404 not found (#9675, closed Apr 7, 2025)
- llama.cpp server API compatibility (#8579, closed Apr 7, 2025)
- How to cancel a generate task via the Ollama REST API (#9372, closed Apr 7, 2025)
- Unable to run current version on macOS (#9736, closed Apr 7, 2025)
- GPU not being used despite CUDA installation and GPU detection (Ollama 0.6.3 on Arch Linux) (#10075, closed Apr 7, 2025)
- Models listed aren't in order (#10153, closed Apr 7, 2025)
- Add full support for omni models (#10004, closed Apr 7, 2025)
- dial tcp: lookup dd20bb891979d25aebc8bec07b2b3bbc.r2.cloudflarestorage.com: no such host (#10151, closed Apr 7, 2025)
- Llama4ForConditionalGeneration unsupported issue (#10158, closed Apr 7, 2025)
- Failed to load `mistral-small:24b-3.1-instruct-2503-q4_K_M` (#10154, closed Apr 6, 2025)
- ollama version 5.11.0 is too slow in the generate process (#10140, closed Apr 6, 2025)
- Performance is terrible (#10137, closed Apr 5, 2025)
- Running Ollama as a k8s STS with an external script as entrypoint to load models (#10122, closed Apr 5, 2025)
- Ollama 0.6.0 with gemma3 can't load models from a mounted Cloud Storage bucket on Cloud Run (#9691, closed Apr 5, 2025)
- llama3-gradient:1048k stuck at loading model (#10111, closed Apr 4, 2025)
- How to allocate more to the GPU? (#10124, closed Apr 4, 2025)
47 Issues opened by 41 people
- Support Jinja chat templates (#10222, opened Apr 10, 2025)
- System crashes when attempting to load a model that exceeds RAM capacity (#10219, opened Apr 10, 2025)
- Image recognition doesn't work with models downloaded from another site (#10218, opened Apr 10, 2025)
- mistral-small3.1 is not fully loaded onto the GPU on RX 7900 XTX (#10217, opened Apr 10, 2025)
- 8*H100 server didn't use the GPU to run the model (#10216, opened Apr 10, 2025)
- Concurrency does not scale when increasing GPUs from 2x to 4x RTX 4090 serving the `qwq` model (#10214, opened Apr 10, 2025)
- Using JSON-structured output seems to affect the model's output (#10213, opened Apr 10, 2025)
- request: run models directly in the browser (#10212, opened Apr 10, 2025)
- Error: could not connect to ollama app, is it running? (macOS) (#10211, opened Apr 10, 2025)
- ollama.com: profile links section incomplete (#10210, opened Apr 10, 2025)
- CLI: environment variable to disable streaming (#10209, opened Apr 10, 2025)
- ollama show --verbose reporting wrong information (#10208, opened Apr 10, 2025)
- API default model (#10207, opened Apr 10, 2025)
- Are there ads in the official website's model listings? (#10206, opened Apr 10, 2025)
- Ollama ignores CUDA after reboot, falls back to CPU only (#10204, opened Apr 9, 2025)
- [Documentation] Automate adding the latest models to the README through GitHub Actions (#10200, opened Apr 9, 2025)
- Error: vocabulary is larger than expected '262145' instead of '262144' (#10199, opened Apr 9, 2025)
- On Windows, the installer defaults to the C drive; can the install location be made configurable? (#10198, opened Apr 9, 2025)
- Add Human Feedback (#10194, opened Apr 9, 2025)
- Supporting private Hugging Face repos hosted on JFrog Artifactory (#10193, opened Apr 9, 2025)
- Sharing GPUs across two servers (#10189, opened Apr 9, 2025)
- Support Dream 7b (#10188, opened Apr 9, 2025)
- Some kernel names are not shown in NVIDIA Nsight Systems (#10184, opened Apr 8, 2025)
- Understanding context length (#10183, opened Apr 8, 2025)
- ollama run phi4-mini error (#10180, opened Apr 8, 2025)
- Option to disable CPU fallback for SoCs with unified memory (#10178, opened Apr 8, 2025)
- tensor-split problem (#10172, opened Apr 8, 2025)
- Multimodal broken in 0.6.5? (#10170, opened Apr 7, 2025)
- Capitalize Ollama in `ollama` help description (#10165, opened Apr 7, 2025)
- qwen2.5:72b and llama3:70b not using GPU: extremely slow and consuming 40GB+ RAM (#10163, opened Apr 7, 2025)
- Will --ctx-size 24576 override the environment variable OLLAMA_CONTEXT_LENGTH? (#10160, opened Apr 7, 2025)
- gemma3:27b gets stuck generating the same token and producing useless gibberish output (#10159, opened Apr 7, 2025)
- Q6 quant (with vision support) for mistral-small:24b-3.1-instruct-2503? (#10157, opened Apr 7, 2025)
- Feature request: always use the GPU (#10155, opened Apr 7, 2025)
- RX580 (#10152, opened Apr 6, 2025)
- Llama 4 support (#10143, opened Apr 5, 2025)
- ollama run: /clear not clearing chat context (#10142, opened Apr 5, 2025)
- OpenAI API: models with slashes not retrievable (#10139, opened Apr 5, 2025)
- Installer corrupted: cublasLt64_12.dll and other files corrupted (#10138, opened Apr 5, 2025)
- Where is the log file, and how can I configure its location? (#10136, opened Apr 5, 2025)
- AI vision: wildly different results between Gemma3 34b via OpenAI vs Ollama endpoint (#10130, opened Apr 4, 2025)
- Incorrect VRAM estimation (#10128, opened Apr 4, 2025)
- panic: failed to decode batch: could not find a kv cache slot (length: 6656) (#10127, opened Apr 4, 2025)
- qwen2.5-coder-cline:14b not using NVIDIA GPU (#10125, opened Apr 4, 2025)
- Structured output: "allOf" (#10123, opened Apr 4, 2025)
- Can't use official QAT GGUF of Gemma-3-27b-it (#10121, opened Apr 4, 2025)
81 Unresolved conversations
Sometimes conversations happen on old items that aren't yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- llama: remove model loading for grammar (#10096, commented Apr 10, 2025 • 19 new comments)
- cmd: default to client2 and simplify pull progress display (#10069, commented Apr 9, 2025 • 6 new comments)
- add ollama stop all (#10043, commented Apr 7, 2025 • 2 new comments)
- Granite new engine (#9966, commented Apr 4, 2025 • 1 new comment)
- support no caching when kvCacheType is "nocache" for deterministic completion (#10064, commented Apr 6, 2025 • 0 new comments)
- NVIDIA GPU drivers not loaded on Jetson Orin Nano (#9503, commented Apr 10, 2025 • 0 new comments)
- SPAM in your model database (#9134, commented Apr 10, 2025 • 0 new comments)
- System memory leak with Gemma3 (#10040, commented Apr 10, 2025 • 0 new comments)
- SIGSEGV: segmentation violation (#9665, commented Apr 10, 2025 • 0 new comments)
- AMD Ryzen NPU support (#5186, commented Apr 10, 2025 • 0 new comments)
- AMD RX9070/9070XT support (#9812, commented Apr 10, 2025 • 0 new comments)
- VRAM usage does not go back down after model unloads: stuck in "Stopping..." (#7606, commented Apr 9, 2025 • 0 new comments)
- [ROCm error: out of memory] Runner terminated: num_ctx within model/hardware limits reliably crashes (#9957, commented Apr 9, 2025 • 0 new comments)
- Models can't be stopped correctly when using WebUI combined with Ollama (#8969, commented Apr 9, 2025 • 0 new comments)
- qwen 2.5 coder stuck "Stopping" (#8178, commented Apr 9, 2025 • 0 new comments)
- Stopping a misbehaving model after some amount of time (#9617, commented Apr 9, 2025 • 0 new comments)
- Provide logits or logprobs in the API (#2415, commented Apr 9, 2025 • 0 new comments)
- Ollama always chooses iGPU for computations in hybrid discrete+iGPU ROCm setups (#9588, commented Apr 9, 2025 • 0 new comments)
- Speed ten times slower than llamafile (#8305, commented Apr 9, 2025 • 0 new comments)
- Batch embeddings get progressively worse with larger batches (#6262, commented Apr 9, 2025 • 0 new comments)
- `pulling manifest Error: EOF` when pulling after disk is full (#1731, commented Apr 9, 2025 • 0 new comments)
- Ollama errors on older versions of Linux/GLIBC on 0.5.13 (#9506, commented Apr 9, 2025 • 0 new comments)
- Ollama refuses to use GFX803 even when detected (#9807, commented Apr 9, 2025 • 0 new comments)
- Adding a search command (#10046, commented Apr 6, 2025 • 0 new comments)
- server: enable content streaming with tools (#10028, commented Apr 4, 2025 • 0 new comments)
- server: prevent model thrashing from unset API fields (#10003, commented Apr 7, 2025 • 0 new comments)
- server: support streaming near tool usage (#9973, commented Apr 7, 2025 • 0 new comments)
- Integration test improvements (#9654, commented Apr 9, 2025 • 0 new comments)
- feat: add debug logging in chat/generate functions (#8957, commented Apr 6, 2025 • 0 new comments)
- feat: Support Moore Threads GPU (#7554, commented Apr 10, 2025 • 0 new comments)
- fix: consider any status code as redirect (#7231, commented Apr 10, 2025 • 0 new comments)
- FEAT: add rerank support (#7219, commented Apr 8, 2025 • 0 new comments)
- AMD integrated graphics on Linux kernel 6.9.9+: GTT memory, loading freeze fix (#6282, commented Apr 5, 2025 • 0 new comments)
- Enable AMD iGPU 780M in Linux; create amd-igpu-780m.md (#5426, commented Apr 8, 2025 • 0 new comments)
- llm/server.go: fix ollama ps showing 100% GPU even when using CPU as runner (#4906, commented Apr 9, 2025 • 0 new comments)
- cobra shell completions (#4690, commented Apr 9, 2025 • 0 new comments)
- Ollama not freeing and eventually running out of memory [all models] (#10114, commented Apr 10, 2025 • 0 new comments)
- Ollama hangs while generating a response (#10119, commented Apr 10, 2025 • 0 new comments)
- Inference with OpenVINO on Intel (#2169, commented Apr 10, 2025 • 0 new comments)
- CUDA error: an illegal memory access was encountered (#9018, commented Apr 10, 2025 • 0 new comments)
- Add support for older AMD GPUs gfx803, gfx802, gfx805 (e.g. Radeon RX 580, FirePro W7100) (#2453, commented Apr 10, 2025 • 0 new comments)
- Add an easy way to list all models and their capabilities (#10097, commented Apr 6, 2025 • 0 new comments)
- Unable to push: max retries exceeded on slower connections (#2155, commented Apr 5, 2025 • 0 new comments)
- Possibility to remove "max retries exceeded" when downloading models over a slow connection (#3162, commented Apr 5, 2025 • 0 new comments)
- Llama.cpp now supports distributed inference across multiple machines (#4643, commented Apr 5, 2025 • 0 new comments)
- Available memory calculation on AMD APU no longer takes GTT into account (#5471, commented Apr 5, 2025 • 0 new comments)
- Allow importing multi-file GGUF models (#5245, commented Apr 5, 2025 • 0 new comments)
- DeepSeek R1 671b is faster than 70b (#10030, commented Apr 5, 2025 • 0 new comments)
- support deepseek 671b fp4 (#9419, commented Apr 4, 2025 • 0 new comments)
- gemma EOF error on image input due to improper memory management (#10041, commented Apr 4, 2025 • 0 new comments)
- ollama does not utilize HBM3 memory on MI300A (#8735, commented Apr 4, 2025 • 0 new comments)
- EOF with Gemma3:27b | POST predict: Post "http://127.0.0.1:35737/completion": EOF (status code: 500) (#9699, commented Apr 4, 2025 • 0 new comments)
- Add generate embedding for sparse vectors (#6230, commented Apr 4, 2025 • 0 new comments)
- add /metrics endpoint (#3144, commented Apr 4, 2025 • 0 new comments)
- Llama3: generated outputs inconsistent despite seed and temperature (#5321, commented Apr 4, 2025 • 0 new comments)
- Update DeepSeek V3 to improved version (#9980, commented Apr 4, 2025 • 0 new comments)
- Using Qwen as an agent in VS Code (#10038, commented Apr 4, 2025 • 0 new comments)
- Pull a model on start or without requiring serve (#3369, commented Apr 4, 2025 • 0 new comments)
- Unsupported value NaN in Ollama log (#9639, commented Apr 4, 2025 • 0 new comments)
- Provide a single command for "serve + pull model", to be used in CI/CD (#5385, commented Apr 4, 2025 • 0 new comments)
- Support for jinaai/jina-embeddings-v3 embedding model (#6922, commented Apr 4, 2025 • 0 new comments)
- Ollama not detecting adapter_config.json file (#9505, commented Apr 9, 2025 • 0 new comments)
- Tool call support in Qwen 2.5 hallucinates with Maybe pattern (#7051, commented Apr 9, 2025 • 0 new comments)
- Support for AMD 9000 GPUs (#9633, commented Apr 8, 2025 • 0 new comments)
- Using split memory (RAM+VRAM) should never happen (#10092, commented Apr 8, 2025 • 0 new comments)
- Provide an updated OpenAPI Specification file (a/k/a "swagger file") with each release (#3383, commented Apr 8, 2025 • 0 new comments)
- Why does gemma3 not work properly on Mac? (#9939, commented Apr 8, 2025 • 0 new comments)
- Isn't it time to move on to omni models? (#6786, commented Apr 8, 2025 • 0 new comments)
- Compute Capability 3.7 still needed (#9620, commented Apr 8, 2025 • 0 new comments)
- Support logit_bias (#3795, commented Apr 7, 2025 • 0 new comments)
- Add support for array head count GGUF KV (#9984, commented Apr 7, 2025 • 0 new comments)
- Ollama ps says 22 GB, but nvidia-smi says 16 GB with flash attention enabled (#6160, commented Apr 7, 2025 • 0 new comments)
- Ollama bug report: application launch issue (#9832, commented Apr 7, 2025 • 0 new comments)
- Error: POST predict: Post "http://127.0.0.1:62622/completion": read tcp 127.0.0.1:62627->127.0.0.1:62622: wsarecv: The remote host has closed a connection (#9674, commented Apr 7, 2025 • 0 new comments)
- add Qwen2-VL/Qwen2.5-VL (#6564, commented Apr 7, 2025 • 0 new comments)
- llama_model_load_from_file_impl: failed to load model (#9541, commented Apr 7, 2025 • 0 new comments)
- Feature request: support for OpenCL (#4373, commented Apr 7, 2025 • 0 new comments)
- Update broken on Linux (#10101, commented Apr 7, 2025 • 0 new comments)
- macOS Ollama not binding to 0.0.0.0 (#3581, commented Apr 6, 2025 • 0 new comments)
- Support AMD GPUs on Intel Macs (#1016, commented Apr 6, 2025 • 0 new comments)
- [ENHANCE] Add Ubuntu support for AMD Ryzen AI 9 HX 370 w/ Radeon 890M (gfx1150) (#9999, commented Apr 6, 2025 • 0 new comments)