
cannot run fine-tuned gpt-oss model correctly #2054

@jiachenguoNU

# Expected Behavior

Should produce output in the same format as llama.cpp.

# Current Behavior

The output is wrong. Could it be related to the harmony format?

The current output of llama-cpp-python:

```
Answer: weak
Llama.generate: 7 prefix-match hit, remaining 1 prompt tokens to eval
form, also known as variational form or integrated form, is a reformulation of a differential equation that involves integrals rather than derivatives.

Question: what is strong form
Answer: strong form, also known as the standard form or classical form, is a formulation of a partial differential equation in which all terms involve derivatives.

Yes, what is weak solution? how to set of a) to b)
how
for

theorem
How do you like
of
Questionary

To solve

Sure

What if
question: strong form

(what is the strong form
```
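
If the harmony hypothesis is right, LangChain's `LlamaCpp` wrapper is passing the bare question through the plain completion endpoint, so the model never sees the role tokens it was trained on and the generation degenerates. One quick test is to wrap the prompt by hand before invoking the model. A minimal sketch, assuming the token spelling from OpenAI's published harmony format (the chat template embedded in the GGUF is the authoritative source):

```python
# Sketch: wrap a plain question in harmony-style role tokens before
# sending it through the plain-text completion path. Token names are
# assumed from OpenAI's harmony format spec; verify them against the
# chat template stored in the GGUF.
def to_harmony(question: str) -> str:
    return (
        f"<|start|>user<|message|>{question}<|end|>"
        "<|start|>assistant"  # generation continues from this header
    )

prompt = to_harmony("what is weak form")
```

If `llm.invoke(prompt)` with the hand-wrapped prompt produces coherent output, the problem is prompt formatting rather than the model weights.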


# Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

* Physical (or virtual) hardware you are using, e.g. for Linux:

`$ lscpu`

```
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Byte Order:                         Little Endian
Address sizes:                      48 bits physical, 48 bits virtual
CPU(s):                             96
On-line CPU(s) list:                0-95
Thread(s) per core:                 2
Core(s) per socket:                 24
Socket(s):                          2
NUMA node(s):                       2
Vendor ID:                          AuthenticAMD
CPU family:                         25
Model:                              1
Model name:                         AMD EPYC 7413 24-Core Processor
Stepping:                           1
Frequency boost:                    enabled
CPU MHz:                            1498.207
CPU max MHz:                        2650.0000
CPU min MHz:                        1500.0000
BogoMIPS:                           5299.85
Virtualization:                     AMD-V
L1d cache:                          1.5 MiB
L1i cache:                          1.5 MiB
L2 cache:                           24 MiB
L3 cache:                           256 MiB
NUMA node0 CPU(s):                  0-23,48-71
NUMA node1 CPU(s):                  24-47,72-95
```


* Operating System, e.g. for Linux:

`$ uname -a`


```
Linux athena 5.4.0-176-generic #196-Ubuntu SMP Fri Mar 22 16:46:39 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
```

* SDK version, e.g. for Linux:

```
$ python3 --version
Python 3.10.18
$ make --version
GNU Make 4.2.1
$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
```


# Failure Information (for bugs)

Just calling the model didn't produce the correct result, but llama.cpp works well and gives me the correct output.

# Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

```python
# Imports assumed from the standard LangChain LlamaCpp example
from langchain_community.llms import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="/mnt/a/jgz1751/finetune/step8GGUF/gguf/Model_Final-32x2.4B-Q8_0.gguf",
    temperature=0.75,
    max_tokens=2000,
    top_p=1,
    callback_manager=callback_manager,
    verbose=True,  # verbose is required to pass to the callback manager
)

question = """
Question: what is weak form
"""
llm.invoke(question)
```
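
For comparison, calling llama-cpp-python's chat API directly applies the chat template stored in the GGUF (harmony, for a gpt-oss fine-tune) instead of sending the raw string through the completion path. A minimal sketch reusing the path and sampling parameters from the snippet above; `n_ctx` is an assumed value:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="/mnt/a/jgz1751/finetune/step8GGUF/gguf/Model_Final-32x2.4B-Q8_0.gguf",
    n_ctx=4096,  # assumption: any context size large enough for the prompt
    verbose=True,
)

# create_chat_completion() runs the messages through the model's own
# chat template before generation, unlike the plain-text completion path.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Question: what is weak form"}],
    temperature=0.75,
    top_p=1.0,
    max_tokens=2000,
)
print(out["choices"][0]["message"]["content"])
```

If this path gives output comparable to the llama.cpp CLI, the wrapper's lack of chat templating is the likely culprit.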

