
Unable to deploy the fine-tuned qwen2.5-vl-7b using llama.cpp. #13723

Closed
@songzhaohui12


I fine-tuned qwen2.5-vl-7b with unsloth and merged the LoRA adapter into the base model. I now want to use llama.cpp to quantize it to Q4. As a first step, I converted the merged model to GGUF format with convert_hf_to_gguf.py. To test the performance of the unquantized model before quantizing, I started the server with the following command:
./llama-server -m /root/autodl-tmp/qwen2.5-vl/qwen-gguf/qwen2.5.gguf -c 2048

The model was deployed successfully without any errors. However, when I tested it with the following request:
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "/root/autodl-tmp/qwen2.5-vl/qwen-gguf/qwen2.5.gguf",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": [
        {"type": "image_url", "image_url": {
          "url": "https://oss-pai-emcfh1jjcesunsrf7g-cn-guangzhou.oss-cn-guangzhou.aliyuncs.com/031920645691.jpg?Expires=1740968587&OSSAccessKeyId=TMP.3KoFNaN1sZAKuMb8zSRv5Ct65nWvYgsQfACyR9DRFXPzTVTVh4Ym6uQUp8nXcoANAP7MatHJB5Gux1iz2iwRgQEfPM4zpc&Signature=2%2FkCE6f5QkjQhY7t9zsCYSacmiA%3D"
        }},
        {"type": "text", "text": "This is a picture of an electricity meter. Extract the meter reading. It has 6 digits in total, and the last digit is a decimal place that should not be extracted. Return only the final meter reading, with no extra content."}
      ]}
    ]
  }'
I encountered an error stating that the model does not support image input.
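
To make the request above easier to check and reuse, the payload can be built separately and piped to curl. This is only a minimal sketch: `build_payload` and the example URL are hypothetical names of my own, not part of llama.cpp, and the helper does no JSON escaping, so it assumes the URL and prompt contain no double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): prints a chat-completions
# payload with one image part and one text part.
# Assumption: neither argument contains double quotes or backslashes,
# because no JSON escaping is done here.
build_payload() {
  image_url=$1
  prompt=$2
  cat <<EOF
{
  "messages": [
    {"role": "user", "content": [
      {"type": "image_url", "image_url": {"url": "$image_url"}},
      {"type": "text", "text": "$prompt"}
    ]}
  ]
}
EOF
}

# Usage, with a placeholder image URL:
# build_payload "https://example.com/meter.jpg" "Return only the meter reading." |
#   curl http://127.0.0.1:8080/v1/chat/completions \
#     -H "Content-Type: application/json" -d @-
```

Keeping the payload in one place makes it simple to print and inspect the JSON before sending it to the server.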

After researching, I found that when deploying multimodal models using llama.cpp, the command generally looks like this:
build/bin/llama-server -m ../models/BroadBit/Qwen2.5-VL-7B-Instruct-Q8_0.gguf --mmproj ../models/BroadBit/mmproj-Qwen2.5-VL-7B-Instruct-f16.gguf -c 32768 -ngl 50 --temp 0.01 -np 1 --host 0.0.0.0 --port 18080 --mlock --no-warmup -t 4
Unlike my command, this one passes a separate projector file through the --mmproj option. How do I generate the corresponding mmproj file when converting a multimodal model to GGUF format with llama.cpp? I am not very familiar with llama.cpp and would appreciate guidance from someone experienced.
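
As far as I can tell, the only difference between the two launch commands that matters here is whether --mmproj is passed. A minimal sketch of that, assuming the flags shown above: `server_cmd` is a hypothetical helper of my own, both paths are placeholders, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: prints a llama-server command line and appends
# --mmproj only when a projector file exists at the given path.
# Paths are assumed to contain no spaces.
server_cmd() {
  model=$1
  mmproj=$2
  cmd="./llama-server -m $model -c 2048"
  if [ -n "$mmproj" ] && [ -f "$mmproj" ]; then
    cmd="$cmd --mmproj $mmproj"
  else
    # Based on the error above: without a projector the server is
    # expected to reject image input.
    echo "warning: no mmproj file found, image input is expected to fail" >&2
  fi
  echo "$cmd"
}

# Usage, with placeholder paths:
# server_cmd /path/to/qwen2.5.gguf /path/to/mmproj.gguf
```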
