Tags · sroecker/llama.cpp

b2773

metal : remove deprecated error code (ggml-org#7008)

Apr 30, 2024
77e15be

b2769

Improve usability of --model-url & related flags (ggml-org#6930)

* args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf)

* args: main & server now call gpt_params_handle_model_default

* args: define DEFAULT_MODEL_PATH + update cli docs

* curl: check url of previous download (.json metadata w/ url, etag & lastModified)

* args: fix update to quantize-stats.cpp

* curl: support legacy .etag / .lastModified companion files

* curl: rm legacy .etag file support

* curl: reuse regex across headers callback calls

* curl: unique_ptr to manage lifecycle of curl & outfile

* curl: nit: no need for multiline regex flag

* curl: update failed test (model file collision) + gitignore *.gguf.json

Apr 29, 2024
8843a98
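
The default-path rule described in the b2769 notes can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from the commit: `resolve_model_path`, `basename_of`, and `kLegacyDefault` are hypothetical names, and giving `--hf-file` precedence over `--model-url` is an assumption the notes do not confirm.

```cpp
#include <string>

// Legacy fallback path quoted in the release notes.
// The constant name is hypothetical.
static const std::string kLegacyDefault = "models/7B/ggml-model-f16.gguf";

// Hypothetical helper: last path component of a URL or file path,
// with any query string or fragment dropped first.
static std::string basename_of(const std::string & s) {
    const std::string trimmed = s.substr(0, s.find_first_of("?#"));
    const auto slash = trimmed.find_last_of('/');
    return slash == std::string::npos ? trimmed : trimmed.substr(slash + 1);
}

// Hypothetical sketch of the rule: an explicit --model wins; otherwise
// the path is "models/" + the file name taken from --hf-file or
// --model-url; otherwise the legacy default is used.
// ASSUMPTION: --hf-file is checked before --model-url.
static std::string resolve_model_path(const std::string & model,
                                      const std::string & model_url,
                                      const std::string & hf_file) {
    if (!model.empty())     return model;
    if (!hf_file.empty())   return "models/" + basename_of(hf_file);
    if (!model_url.empty()) return "models/" + basename_of(model_url);
    return kLegacyDefault;
}
```

With this sketch, a URL ending in `model-q4.gguf?download=true` would resolve to `models/model-q4.gguf`, and passing no flags at all would yield the legacy path.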

b2755

Fix more int overflow during quant (PPL/CUDA). (ggml-org#6563)

* Fix more int overflow during quant.

* Fix some more int overflow in softmax.

* Revert back to int64_t.

Apr 28, 2024
e00b4a8
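
The kind of overflow that the b2755 notes refer to can be illustrated with a small sketch. It is an assumption about the general failure mode (element counts or offsets that exceed the 32-bit range), not a reproduction of the patched code; `element_count` and `fits_in_int32` are hypothetical helpers.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: number of elements in a rows x cols tensor.
// Widening one operand to int64_t before multiplying keeps the product
// from overflowing a 32-bit int.
static int64_t element_count(int rows, int cols) {
    return static_cast<int64_t>(rows) * cols;
}

// Hypothetical helper: true if the value could be stored in a 32-bit
// signed int without loss.
static bool fits_in_int32(int64_t v) {
    return v >= std::numeric_limits<int32_t>::min() &&
           v <= std::numeric_limits<int32_t>::max();
}
```

For a 65536 x 65536 tensor the element count is 4294967296, which does not fit in a 32-bit `int`. Computing `rows * cols` in plain `int` would be undefined behavior in that case, which is one reason to keep such indices in `int64_t`.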
