JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length #6555

Merged on Apr 12, 2024 (17 commits)
grammars: add troubleshooting section to readme
ochafik committed Apr 8, 2024
commit 07163fb627f5a43d48b9698bdeb1322279c3a379
10 changes: 10 additions & 0 deletions grammars/README.md
@@ -89,3 +89,13 @@ This guide provides a brief overview. Check out the GBNF files in this directory
```
./main -m <model> --grammar-file grammars/some-grammar.gbnf -p 'Some prompt'
```

## Troubleshooting
Collaborator

I like this section. After this gets merged in, I'll write a section on the dangers of left-recursion.

Collaborator Author

We should probably also document the JSON-to-grammar converters here; I'll send that separately.


Grammars currently have performance gotchas (see https://github.com/ggerganov/llama.cpp/issues/4218).

### Efficient optional repetitions

A common pattern is to allow repetitions of a pattern `x` up to N times.

While semantically correct, the syntax `x? x? x? ... x?` (with N repetitions) will result in extremely slow inference. Instead, you can write `(x (x (x ... (x)? ...)?)?)?` (with N-deep nesting).
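
For larger N, writing the nested form by hand is error-prone, so it can help to generate it. The following is a minimal Python sketch that builds both forms as strings for comparison. The helper names `flat_optional` and `nested_optional` are hypothetical and not part of llama.cpp; the sketch assumes `x` is a rule name or an already-parenthesized expression.

```python
def flat_optional(item: str, n: int) -> str:
    """Build the slow form: `x? x? ... x?` with n copies of `item`."""
    return " ".join(f"{item}?" for _ in range(n))


def nested_optional(item: str, n: int) -> str:
    """Build the nested form: `(x (x (x)?)?)?` with n-deep nesting.

    Hypothetical helper for illustration only.
    """
    if n <= 0:
        return ""
    # Start from the innermost optional group and wrap outward.
    result = f"({item})?"
    for _ in range(n - 1):
        result = f"({item} {result})?"
    return result


if __name__ == "__main__":
    print(flat_optional("x", 3))
    print(nested_optional("x", 3))
```

Both strings accept between zero and N repetitions of `x`; only the nested one follows the shape recommended above.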