Skip to content

server: update deepseek reasoning format (pass reasoning_content as diffs) #13933

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 2, 2025

Conversation

ochafik
Copy link
Collaborator

@ochafik ochafik commented May 30, 2025

Fixes #13867

(updated / simplified webui accordingly edit: reverted UI changes, will let @ngxson update it as follow up)

@github-actions github-actions bot added testing Everything test related examples server labels May 30, 2025
@ochafik ochafik marked this pull request as ready for review May 31, 2025 00:02
@ochafik ochafik requested a review from ngxson as a code owner May 31, 2025 00:02
@github-actions github-actions bot added the python python script changes label May 31, 2025
Copy link
Collaborator

@ngxson ngxson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will redo the frontend changes in a more structured way, can you maybe revert these changes?

For now, I think it's worth adding a per-request reasoning format control

Comment on lines 58 to 68
// for reasoning model, we split the message into content and thought
// TODO: implement this as remark/rehype plugin in the future
const { content, thought, isThinking }: SplitMessage = useMemo(() => {
if (msg.content === null || msg.role !== 'assistant') {
return { content: msg.content };
}
let actualContent = '';
let thought = '';
let isThinking = false;
let thinkSplit = msg.content.split('<think>', 2);
actualContent += thinkSplit[0];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If user doesn't explicitly enable deepseek reasoning format (which is the default behavior), this code is there to make sure the thinking is always parsed.

Unless we can control thinking format per-request, I don't think we can remove this code block

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted UI changes

@@ -112,6 +113,7 @@ export interface ViewingChat {

export type PendingMessage = Omit<Message, 'content'> & {
content: string | null;
reasoningContent: string | null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh I'm not very confident adding this, because it is not future-proof.

If you look at how chatgpt structure their message, the content is an array instead of a string, and that is what I wanted to do in near future. It will allow different message parts, like for example a message can contains both reasoning, text response and a tool call

@@ -2869,6 +2869,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
"(default: deepseek)",
[](common_params & params, const std::string & value) {
/**/ if (value == "deepseek") { params.reasoning_format = COMMON_REASONING_FORMAT_DEEPSEEK; }
else if (value == "deepseek-legacy") { params.reasoning_format = COMMON_REASONING_FORMAT_DEEPSEEK_LEGACY; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a help message to explain this mode?

@ochafik ochafik merged commit c9bbc77 into ggml-org:master Jun 2, 2025
48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
examples python python script changes server testing Everything test related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Misc. bug: Reasoning content is not separated when streaming
3 participants