
cannot run fine-tuned gpt-oss model correctly #2054

@jiachenguoNU

# Expected Behavior

Should produce output in the same format as llama.cpp.

# Current Behavior

The output is wrong. Could it be related to the harmony format?

The current output of llama-cpp-python:

```
Answer: weak
Llama.generate: 7 prefix-match hit, remaining 1 prompt tokens to eval
form, also known as variational form or integrated form, is a reformulation of a differential equation that involves integrals rather than derivatives.

Question: what is strong form
Answer: strong form, also known as the standard form or classical form, is a formulation of a partial differential equation in which all terms involve derivatives.

Yes, what is weak solution? how to set of a) to b)
how
for

theorem
How do you like
of
Questionary

To solve

Sure

What if
question: strong form

(what is the strong form
```
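
If the harmony hypothesis is right, LangChain's `LlamaCpp` wrapper is passing the bare question through the plain completion endpoint, so the model never sees the role tokens it was trained on and the generation degenerates. One quick test is to wrap the prompt by hand before invoking the model. A minimal sketch, assuming the token spelling from OpenAI's published harmony format (the chat template embedded in the GGUF is the authoritative source):

```python
# Sketch: wrap a plain question in harmony-style role tokens before
# sending it through the plain-text completion path. Token names are
# assumed from OpenAI's harmony format spec; verify them against the
# chat template stored in the GGUF.
def to_harmony(question: str) -> str:
    return (
        f"<|start|>user<|message|>{question}<|end|>"
        "<|start|>assistant"  # generation continues from this header
    )

prompt = to_harmony("what is weak form")
```

If `llm.invoke(prompt)` with the hand-wrapped prompt produces coherent output, the problem is prompt formatting rather than the model weights.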


# Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

* Physical (or virtual) hardware you are using, e.g. for Linux:

`$ lscpu`

```
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Byte Order:                         Little Endian
Address sizes:                      48 bits physical, 48 bits virtual
CPU(s):                             96
On-line CPU(s) list:                0-95
Thread(s) per core:                 2
Core(s) per socket:                 24
Socket(s):                          2
NUMA node(s):                       2
Vendor ID:                          AuthenticAMD
CPU family:                         25
Model:                              1
Model name:                         AMD EPYC 7413 24-Core Processor
Stepping:                           1
Frequency boost:                    enabled
CPU MHz:                            1498.207
CPU max MHz:                        2650.0000
CPU min MHz:                        1500.0000
BogoMIPS:                           5299.85
Virtualization:                     AMD-V
L1d cache:                          1.5 MiB
L1i cache:                          1.5 MiB
L2 cache:                           24 MiB
L3 cache:                           256 MiB
NUMA node0 CPU(s):                  0-23,48-71
NUMA node1 CPU(s):                  24-47,72-95
```


* Operating System, e.g. for Linux:

`$ uname -a`


```
Linux athena 5.4.0-176-generic #196-Ubuntu SMP Fri Mar 22 16:46:39 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
```

* SDK version, e.g. for Linux:

```
$ python3 --version
Python 3.10.18
$ make --version
GNU Make 4.2.1
$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
```


# Failure Information (for bugs)

Just calling the model didn't produce the correct result, but llama.cpp works well and gives me the correct output.

# Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

```python
# Imports assumed from the standard LangChain LlamaCpp example
from langchain_community.llms import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="/mnt/a/jgz1751/finetune/step8GGUF/gguf/Model_Final-32x2.4B-Q8_0.gguf",
    temperature=0.75,
    max_tokens=2000,
    top_p=1,
    callback_manager=callback_manager,
    verbose=True,  # verbose is required to pass to the callback manager
)

question = """
Question: what is weak form
"""
llm.invoke(question)
```
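
For comparison, calling llama-cpp-python's chat API directly applies the chat template stored in the GGUF (harmony, for a gpt-oss fine-tune) instead of sending the raw string through the completion path. A minimal sketch reusing the path and sampling parameters from the snippet above; `n_ctx` is an assumed value:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="/mnt/a/jgz1751/finetune/step8GGUF/gguf/Model_Final-32x2.4B-Q8_0.gguf",
    n_ctx=4096,  # assumption: any context size large enough for the prompt
    verbose=True,
)

# create_chat_completion() runs the messages through the model's own
# chat template before generation, unlike the plain-text completion path.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Question: what is weak form"}],
    temperature=0.75,
    top_p=1.0,
    max_tokens=2000,
)
print(out["choices"][0]["message"]["content"])
```

If this path gives output comparable to the llama.cpp CLI, the wrapper's lack of chat templating is the likely culprit.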

