@twaka twaka commented Apr 8, 2024

Hi,
I've added a min_tokens argument, which sets the EOS token's logit to -inf via a logits processor while the number of generated tokens is smaller than min_tokens.
Note that this implementation doesn't prevent generation from stopping for another reason (e.g. max_tokens, stop, stopping_criteria).
Completes #240
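The approach described above can be sketched as a standalone logits processor. This is a hypothetical minimal version, not the exact PR code; the names `make_min_tokens_processor`, `eos_token_id`, and `prompt_length` are illustrative assumptions:

```python
import numpy as np

def make_min_tokens_processor(min_tokens, eos_token_id, prompt_length):
    """Return a logits processor that forbids sampling EOS until
    min_tokens completion tokens have been generated (sketch only)."""
    def processor(input_ids, scores):
        # Completion length = total tokens so far minus the prompt.
        generated = len(input_ids) - prompt_length
        if generated < min_tokens:
            # Make EOS unsampleable; other stop conditions
            # (max_tokens, stop strings, ...) still apply.
            scores[eos_token_id] = -float("inf")
        return scores
    return processor
```

As the note above says, this only blocks the EOS token itself; generation can still end early through max_tokens or stop sequences.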


abetlen commented Apr 17, 2024

Hey @twaka, thank you for the contribution. I'd prefer to expose a MinTokensLogitsProcessor instead of adding it as an argument.


twaka commented Apr 17, 2024

Thank you for taking a look!
One problem I can think of is that MinTokensLogitsProcessor needs the length of the prompt tokens.
That length is not straightforward to obtain before tokenization happens in llama_cpp._create_completion, especially for chat completions.


twaka commented May 8, 2024

@abetlen Sorry for the delay. I've changed the implementation as you suggested. Could you please take a look?
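One way to sidestep the prompt-length problem raised earlier is to capture the prompt length lazily on the processor's first call. This is a hedged sketch of that idea, assuming the processor is first invoked with only the prompt tokens; the class and attribute names are illustrative and may not match the merged code exactly:

```python
import numpy as np

class MinTokensLogitsProcessor:
    """Suppress EOS until min_tokens completion tokens exist.

    Illustrative sketch: the prompt length is recorded on the first
    call, so callers need not know it ahead of tokenization.
    """

    def __init__(self, min_tokens: int, token_eos: int):
        self.min_tokens = min_tokens
        self.token_eos = token_eos
        self.prompt_tokens = None  # filled in on the first call

    def __call__(self, input_ids, scores):
        if self.prompt_tokens is None:
            # First call sees only the prompt tokens.
            self.prompt_tokens = len(input_ids)
        if len(input_ids) - self.prompt_tokens < self.min_tokens:
            scores[self.token_eos] = -float("inf")
        return scores
```

Because the state lives inside the processor, it can be constructed up front and passed in without knowing where tokenization happens.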


abetlen commented May 14, 2024

Hey @twaka sorry for the delay, and thank you for implementing the change, happy to merge this now!

@abetlen abetlen merged commit 5212fb0 into abetlen:main May 14, 2024
@twaka twaka deleted the min_tokens branch May 15, 2024 05:17