
Eval bug: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B can't convert to gguf #15734

@ATRI-Star

Description


Name and Version

I want to save my fine-tuned model, but it seems this model has not been added to convert_hf_to_gguf_update.py yet. Could support be added soon? Thank you so much!
model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
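
For context, the converter decides which BPE pre-tokenizer to use by fingerprinting the model's tokenizer. A minimal self-contained sketch of that idea (not the actual llama.cpp source: the real script loads the model's Hugging Face tokenizer and hashes the token IDs it produces for a much longer fixed test string; `ToyTokenizer` here is a hypothetical stand-in):

```python
import hashlib

# Hypothetical stand-in for the model's Hugging Face tokenizer; the real
# script loads it with AutoTokenizer.from_pretrained(model_dir).
class ToyTokenizer:
    def encode(self, text: str) -> list[int]:
        return [ord(c) for c in text]

def chkhsh_of(tokenizer, chktxt: str) -> str:
    # convert_hf_to_gguf.py fingerprints a BPE pre-tokenizer by encoding a
    # fixed test string and hashing the resulting token-ID list.
    ids = tokenizer.encode(chktxt)
    return hashlib.sha256(str(ids).encode()).hexdigest()

print(chkhsh_of(ToyTokenizer(), "test string"))
```

convert_hf_to_gguf_update.py regenerates the table of known hashes; a hash missing from that table produces the warning shown below.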

Operating systems

Windows

GGML backends

CUDA

Hardware

Intel Core i7-12700KF + NVIDIA GeForce RTX 3060

Models

No response

Problem description & steps to reproduce

INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
WARNING:hf-to-gguf:

WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:**          There are 2 possible reasons for this:
WARNING:hf-to-gguf:**          - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:**          - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:**          Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref:     https://github.com/ggml-org/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh:  b0f33aec525001c9de427a8f9958d1c8a3956f476bec64403680521281c032e2
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:

Traceback (most recent call last):
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 2966, in set_vocab
    self._set_vocab_sentencepiece()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 981, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 998, in _create_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: C:\Users\cxzhu\Desktop\unsloth\FineFune\model\tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 8985, in <module>
    main()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 8979, in main
    model_instance.write()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 430, in write
    self.prepare_metadata(vocab_only=False)
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 551, in prepare_metadata
    self.set_vocab()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 3669, in set_vocab
    super().set_vocab()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 2968, in set_vocab
    self._set_vocab_gpt2()
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 917, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
                               ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 641, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cxzhu\Desktop\unsloth\FineFune\llama.cpp\convert_hf_to_gguf.py", line 905, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
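
In case it helps triage: the NotImplementedError comes from a hash-to-name dispatch in get_vocab_base_pre(), so adding support amounts to registering the chkhsh printed above. A simplified sketch of that pattern follows; the pre-tokenizer name "deepseek-r1-qwen" is an assumption for illustration, not the actual identifier llama.cpp would use:

```python
# Simplified sketch of the dispatch performed by get_vocab_base_pre() in
# convert_hf_to_gguf.py; the name below is illustrative, not the real table.
def get_vocab_base_pre(chkhsh: str) -> str:
    res = None
    if chkhsh == "b0f33aec525001c9de427a8f9958d1c8a3956f476bec64403680521281c032e2":
        # hash reported in this issue; the mapped name is an assumption
        res = "deepseek-r1-qwen"
    if res is None:
        # an unknown hash produces exactly the failure seen in this traceback
        raise NotImplementedError(
            "BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
    return res
```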

First Bad Commit

NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
