top_p = 1 causes deterministic outputs #1797

@oobabooga

Description

Setting top_p = 1 causes outputs to be identical across runs, even with a random seed. This was discovered in oobabooga/text-generation-webui#6431 (comment); see that issue for the full discussion.

Reproduction

```python
from llama_cpp import Llama

# Load the model
model = Llama(
    model_path="models/Meta-Llama-3-8B-Instruct-Q4_K_S-HF/Meta-Llama-3-8B-Instruct-Q4_K_S.gguf",
    n_gpu_layers=128,
)

# Define the prompt
prompt = "Once upon a time"

for i in range(5):
    # Generate text with temperature = 1, top_p = 1, and a random seed
    completion = model.create_completion(prompt=prompt, max_tokens=50, temperature=1.0, top_p=1.0, seed=-1)

    # Print the generated text
    print(completion['choices'][0]['text'])
```

The 5 outputs will be identical.

Verified with llama-cpp-python==0.3.1.
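For context, here is a minimal pure-Python sketch of top-p (nucleus) sampling, not llama.cpp's actual sampler. With top_p = 1.0 the nucleus contains every token, so sampling should still vary under a random seed; the identical outputs suggest the top_p = 1.0 path is being handled differently (e.g. short-circuited) somewhere in the stack.

```python
def top_p_filter(probs, top_p):
    """Return the nucleus: indices of the smallest set of
    highest-probability tokens whose cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, total = [], 0.0
    for i in order:
        nucleus.append(i)
        total += probs[i]
        if total >= top_p:
            break
    return nucleus

probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(probs, 1.0))  # every token survives: [0, 1, 2, 3]
print(top_p_filter(probs, 0.5))  # only the top token: [0]
```

Under this reading, top_p = 1.0 should behave as "no filtering at all", which is why deterministic output with seed=-1 is surprising.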
