server : update deepseek reasoning format (pass reasoning_content as diffs) #13933
Conversation
…ffs), add legacy option for compat
I think I will redo the frontend changes in a more structured way, can you maybe revert these changes?
For now, I think it's worth adding a per-request reasoning format control
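The per-request control suggested here could look like an extra field on the chat completion request body. This is only a sketch of the proposal, not the actual server API: the field name `reasoning_format` and the `ChatCompletionRequest` / `buildRequest` names are assumptions made for illustration, mirroring the server's CLI flag values.

```typescript
// Hypothetical per-request reasoning format override (illustration only).
// Values mirror the --reasoning-format CLI flag discussed in this PR.
type ReasoningFormat = 'none' | 'deepseek' | 'deepseek-legacy';

interface ChatCompletionRequest {
  messages: { role: string; content: string }[];
  stream?: boolean;
  // proposed per-request override of the server-wide setting
  reasoning_format?: ReasoningFormat;
}

function buildRequest(
  messages: ChatCompletionRequest['messages'],
  format: ReasoningFormat = 'deepseek'
): ChatCompletionRequest {
  return { messages, stream: true, reasoning_format: format };
}
```

With such a parameter, the webui could ask for whichever format it can render, instead of depending on how the server was launched.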
// for reasoning model, we split the message into content and thought
// TODO: implement this as remark/rehype plugin in the future
const { content, thought, isThinking }: SplitMessage = useMemo(() => {
  if (msg.content === null || msg.role !== 'assistant') {
    return { content: msg.content };
  }
  let actualContent = '';
  let thought = '';
  let isThinking = false;
  let thinkSplit = msg.content.split('<think>', 2);
  actualContent += thinkSplit[0];
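The snippet above is truncated at the point where the closing tag would be handled. A self-contained sketch of the same splitting idea (a standalone function for illustration, not the actual webui component code) could be:

```typescript
// Illustrative standalone version of the <think> splitting logic:
// separate reasoning ("thought") from visible content, and report
// whether a <think> block is still open (useful while streaming).
interface SplitResult {
  content: string;
  thought: string;
  isThinking: boolean;
}

function splitThought(raw: string): SplitResult {
  let content = '';
  let thought = '';
  let isThinking = false;
  let rest = raw;
  for (;;) {
    const open = rest.indexOf('<think>');
    if (open === -1) {
      // no more thought blocks: the remainder is plain content
      content += rest;
      break;
    }
    content += rest.slice(0, open);
    rest = rest.slice(open + '<think>'.length);
    const close = rest.indexOf('</think>');
    if (close === -1) {
      // stream ended mid-thought: the block is still open
      thought += rest;
      isThinking = true;
      break;
    }
    thought += rest.slice(0, close);
    rest = rest.slice(close + '</think>'.length);
  }
  return { content, thought, isThinking };
}
```

Note that `indexOf` is used here instead of `split(tag, 2)`: in JavaScript the `limit` argument of `split` truncates the result array and silently discards the remainder, which is easy to get wrong when more than one tag appears.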
If the user doesn't explicitly enable the deepseek reasoning format (which is the default behavior), this code ensures the thinking is always parsed.
Unless we can control the thinking format per-request, I don't think we can remove this code block.
reverted UI changes
@@ -112,6 +113,7 @@ export interface ViewingChat {

export type PendingMessage = Omit<Message, 'content'> & {
  content: string | null;
  reasoningContent: string | null;
Tbh I'm not very confident adding this, because it is not future-proof.
If you look at how ChatGPT structures its messages, the content
is an array instead of a string, and that is what I want to do in the near future. It will allow different message parts; for example, a message can contain reasoning, a text response, and a tool call at the same time.
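The array-of-parts shape mentioned here could be sketched as follows. These type names (`MessagePart`, `StructuredMessage`, `visibleText`) are hypothetical, chosen only to illustrate the idea of a message carrying reasoning, text, and a tool call side by side:

```typescript
// Hypothetical structured message: content is an array of typed parts
// instead of a single string, so reasoning, text, and tool calls can
// coexist in one message.
type MessagePart =
  | { type: 'text'; text: string }
  | { type: 'reasoning'; text: string }
  | { type: 'tool_call'; name: string; arguments: string };

interface StructuredMessage {
  role: 'user' | 'assistant';
  content: MessagePart[];
}

// Collect only the user-visible text parts of a structured message.
function visibleText(msg: StructuredMessage): string {
  return msg.content
    .filter((p): p is { type: 'text'; text: string } => p.type === 'text')
    .map((p) => p.text)
    .join('');
}
```

A discriminated union like this lets the renderer switch on `part.type`, so new part kinds can be added later without touching the string-splitting logic at all.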
@@ -2869,6 +2869,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
        "(default: deepseek)",
        [](common_params & params, const std::string & value) {
            /**/ if (value == "deepseek")        { params.reasoning_format = COMMON_REASONING_FORMAT_DEEPSEEK; }
            else if (value == "deepseek-legacy") { params.reasoning_format = COMMON_REASONING_FORMAT_DEEPSEEK_LEGACY; }
should we add a help message to explain this mode?
Fixes #13867

(updated / simplified webui accordingly. edit: reverted UI changes, will let @ngxson update it as a follow-up)