Feature Request: Make the `/completion` endpoint in `llama-server` work with multimodal models #13872

### Prerequisites

- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the [Discussions](https://github.com/ggml-org/llama.cpp/discussions), and have a new and useful enhancement to share.

### Feature Description

The [documentation](https://github.com/ggml-org/llama.cpp/tree/master/tools/server#post-completion-given-a-prompt-it-returns-the-predicted-completion) for `/completion` still describes this obsolete field:

> `image_data`: An array of objects to hold base64-encoded image `data` and its `id`s to be reference in `prompt`. You can determine the place of the image in the prompt as in the following: `USER:[img-12]Describe the image in detail.\nASSISTANT:`. In this case, `[img-12]` will be replaced by the embeddings of the image with id `12` in the following `image_data` array: `{..., "image_data": [{"data": "<BASE64_STRING>", "id": 12}]}`. Use `image_data` only with multimodal models, e.g., LLaVA.

However, when I pass a prompt containing `[img-1]` to a multimodal model loaded along with its corresponding mmproj, the model doesn't understand the image. The same image works fine through the `/chat/completions` endpoint.
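
For illustration, here is a minimal sketch of a request body in the format the quoted documentation describes. The helper names `build_completion_payload` and `post_completion`, the placeholder image bytes, and the `http://localhost:8080` base URL are assumptions made for this example, not part of llama.cpp; only the `prompt` and `image_data` fields come from the documentation.

```python
import base64
import json
import urllib.request


def build_completion_payload(prompt: str, images: dict[int, bytes]) -> dict:
    """Build a /completion body using the documented `image_data` layout.

    `images` maps an image id to raw image bytes; each id should match an
    `[img-<id>]` placeholder in `prompt`. (Hypothetical helper.)
    """
    return {
        "prompt": prompt,
        "image_data": [
            {"data": base64.b64encode(raw).decode("ascii"), "id": image_id}
            for image_id, raw in images.items()
        ],
    }


def post_completion(payload: dict, base_url: str = "http://localhost:8080") -> dict:
    """POST the payload to <base_url>/completion and return the parsed JSON.

    The base URL is an assumption; adjust it to wherever llama-server listens.
    """
    request = urllib.request.Request(
        f"{base_url}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Placeholder bytes stand in for a real image file read with open(..., "rb").
payload = build_completion_payload(
    "USER:[img-1]Describe the image in detail.\nASSISTANT:",
    {1: b"placeholder image bytes"},
)
```

Calling `post_completion(payload)` against a running server would send this request; the payload is built separately so that its shape can be inspected without a server.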


### Motivation

My [project](https://github.com/oobabooga/text-generation-webui) does its own prompt formatting and communicates with llama.cpp through `/completion`. I would like to integrate llama.cpp's multimodal feature but am unable to due to the limitation above.

### Possible Implementation

_No response_
