Tags: duaneking/llama.cpp

master-924dd22

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Quantized dot products for CUDA mul mat vec (ggml-org#2067)

master-051c70d

llama: Don't double count the sampling time (ggml-org#2107)

master-9e4475f

Fixed OpenCL offloading prints (ggml-org#2082)

master-f257fd2

Add an API example using server.cpp similar to OAI. (ggml-org#2009)

* add api_like_OAI.py
* add evaluated token count to server
* add /v1/ endpoints binding

master-ed9a54e

ggml : sync latest (new ops, macros, refactoring) (ggml-org#2106)

- add ggml_argmax()
- add ggml_tanh()
- add ggml_elu()
- refactor ggml_conv_1d() and variants
- refactor ggml_conv_2d() and variants
- add helper macros to reduce code duplication in ggml.c

master-acc111c

Allow old Make to build server. (ggml-org#2098)

Also make server build by default.

Tested with Make 3.82

master-23c7c6f

Update Makefile: clean simple (ggml-org#2097)

master-7f0e9a7

embd-input: Fix input embedding example unsigned int seed (ggml-org#2105)

master-7ee76e4

Simple webchat for server (ggml-org#1998)

* expose simple web interface on root domain

* embed index and add --path for choosing static dir

* allow server to multithread

Because web browsers send a lot of garbage requests, we want the server
to multithread when serving 404s for favicons etc. To avoid blowing up
llama, we just take a mutex when it's invoked.


* let's try this with the xxd tool instead and see if msvc is happier with that

* enable server in Makefiles

* add /completion.js file to make it easy to use the server from js

* slightly nicer css

* rework state management into session, expose historyTemplate to settings

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

master-698efad

CI: make the brew update temporarily optional. (ggml-org#2092)

until they decide to fix the brew installation in the macOS runners.
See the open issues, e.g. actions/runner-images#7710