
Eval bug: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B can't convert to gguf #15734

@ATRI-Star

Description


Name and Version

I want to save my fine-tuned model, but it seems this model has not been added to convert_hf_to_gguf_update.py yet. Could support be added soon? Thank you so much!
model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
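
For context, the converter decides which BPE pre-tokenizer to use by fingerprinting the model's tokenizer. A minimal self-contained sketch of that idea (not the actual llama.cpp source: the real script loads the model's Hugging Face tokenizer and hashes the token IDs it produces for a much longer fixed test string; `ToyTokenizer` here is a hypothetical stand-in):

```python
import hashlib

# Hypothetical stand-in for the model's Hugging Face tokenizer; the real
# script loads it with AutoTokenizer.from_pretrained(model_dir).
class ToyTokenizer:
    def encode(self, text: str) -> list[int]:
        return [ord(c) for c in text]

def chkhsh_of(tokenizer, chktxt: str) -> str:
    # convert_hf_to_gguf.py fingerprints a BPE pre-tokenizer by encoding a
    # fixed test string and hashing the resulting token-ID list.
    ids = tokenizer.encode(chktxt)
    return hashlib.sha256(str(ids).encode()).hexdigest()

print(chkhsh_of(ToyTokenizer(), "test string"))
```

convert_hf_to_gguf_update.py regenerates the table of known hashes; a hash missing from that table produces the warning shown below.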

Operating systems

Windows

GGML backends

CUDA

Hardware

Intel Core i7-12700KF + NVIDIA GeForce RTX 3060

Models

No response

Problem description & steps to reproduce

INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
WARNING:hf-to-gguf:

WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:**          There are 2 possible reasons for this:
WARNING:hf-to-gguf:**          - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:**          - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:**          Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref:     https://github.com/ggml-org/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh:  b0f33aec525001c9de427a8f9958d1c8a3956f476bec64403680521281c032e2
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:

Traceback (most recent call last):
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 2966, in set_vocab
    self._set_vocab_sentencepiece()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 981, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 998, in _create_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: C:\Users\cxzhu\Desktop\unsloth\FineFune\model\tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 8985, in <module>
    main()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 8979, in main
    model_instance.write()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 430, in write
    self.prepare_metadata(vocab_only=False)
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 551, in prepare_metadata
    self.set_vocab()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 3669, in set_vocab
    super().set_vocab()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 2968, in set_vocab
    self._set_vocab_gpt2()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 917, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
                               ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 641, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 905, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
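
In case it helps triage: the NotImplementedError comes from a hash-to-name dispatch in get_vocab_base_pre(), so adding support amounts to registering the chkhsh printed above. A simplified sketch of that pattern follows; the pre-tokenizer name "deepseek-r1-qwen" is an assumption for illustration, not the actual identifier llama.cpp would use:

```python
# Simplified sketch of the dispatch performed by get_vocab_base_pre() in
# convert_hf_to_gguf.py; the name below is illustrative, not the real table.
def get_vocab_base_pre(chkhsh: str) -> str:
    res = None
    if chkhsh == "b0f33aec525001c9de427a8f9958d1c8a3956f476bec64403680521281c032e2":
        # hash reported in this issue; the mapped name is an assumption
        res = "deepseek-r1-qwen"
    if res is None:
        # an unknown hash produces exactly the failure seen in this traceback
        raise NotImplementedError(
            "BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
    return res
```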

First Bad Commit

NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
