tempstudio
Contributor

CosyVoice2-0.5B (https://github.com/FunAudioLLM/CosyVoice/blob/main/cosyvoice/vllm/cosyvoice2.py#L93) is a TTS model fine-tuned on top of Qwen2-0.5B, with an extra bias tensor on the decoder head.

This change allows that bias tensor to be loaded, which improves output quality when running CosyVoice2 in llama.cpp.
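
For context, here is a minimal sketch of why the bias matters. It is not the llama.cpp or CosyVoice code: the function name `output_head`, the toy shapes, and the values are all hypothetical, and it only assumes the head computes `logits = weight · hidden + bias`. If the bias is dropped at load time, the logits shift and a different token can win.

```python
# Illustrative sketch only: names, shapes, and values are hypothetical,
# not taken from llama.cpp or CosyVoice.

def output_head(hidden, weight, bias=None):
    """Project a hidden state to logits: logits[i] = dot(weight[i], hidden) + bias[i]."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in weight]
    if bias is not None:
        # The extra tensor this change loads; without it the term is silently skipped.
        logits = [l + b for l, b in zip(logits, bias)]
    return logits


def argmax(values):
    """Index of the largest value (greedy token choice)."""
    return max(range(len(values)), key=lambda i: values[i])


hidden = [1.0, 2.0]                            # toy hidden state, dimension 2
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy head, vocabulary of 3
bias = [0.0, 0.0, -2.5]                        # toy per-token bias

without_bias = output_head(hidden, weight)      # bias ignored
with_bias = output_head(hidden, weight, bias)   # bias applied

print(argmax(without_bias), argmax(with_bias))
```

In this toy case the greedy choice differs between the two calls, which is the kind of drift that loading the bias is meant to avoid.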

@ggerganov ggerganov merged commit b0f0ecc into ggml-org:master Jul 16, 2025
45 of 48 checks passed
@hipudding
Collaborator

@tempstudio Hello, may I ask how I should load and run inference with the CosyVoice2 model? Thanks.

@hipudding
Collaborator

@tempstudio I was thinking about it: this approach only allows llama.cpp to be used as a library, correct? It can't yet run inference for CosyVoice directly, since preprocessing and postprocessing would still need to be handled by external code.

@tempstudio
Contributor Author

@hipudding Yes, you are right: this is a replacement for the LLM part of CosyVoice only. It doesn't cover the FLOW and HIFIGAN parts of CosyVoice.

@hipudding
Collaborator

Thanks. Do you have plans to fully support CosyVoice?
