-
Notifications
You must be signed in to change notification settings - Fork 7
Implemented KVCacheAwareScorer #34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The force-push rebases onto dev and reimplements the KVCacheScorer along with the new missing logic. |
cb10c4b
to
83a91a6
Compare
The force push undos a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed mainly the scorer itself. there are additional changes around it that are mostly to support the scorer which seems ok (all of them).
overall this LGTM, just a few comments to consider, up to you.
the scorer interface changed in upstream in two main things:
-
now all filters/scorers/picker do not return errors (cause we don't want to stop the scheduling cycle when that happens, you can log the error in your scorer).
-
scorer gets all pods to score and not one by one. this change was driven by the kv scorer which access external service and is much more efficient when getting all pods in one call. accordingly the return value was changed to map[pod]float (map from pod to its score).
these changes are part of kubernetes-sigs#737 which is basically approved, but will be merged probably tomorrow (there were some minor comments that I addressed).
/lgtm
0e3fe98
to
f5bb1ae
Compare
please rebase to dev to get latest pipeline - Thank you |
- added configuration Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
- configured maxscorepicker as default Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
/retest |
Summary
Implemented KVCacheAwareScorer which uses
kvcache.Indexer
as an embedded module.The latter is configured with default configuration + an env settable Redis server address, and a HuggingFace token which is an acceptable practice.
Changes:
Dockerfile
was changed:kvcache.Indexer
directly is having to statically link the HuggingFace Rust tokenizer's Go bindings.kvcache-manager
repo being internal, having Go download its modules requires configuring access. ThereforeGIT_NM_USER
andNM_TOKEN
have to be set.Makefile
'simage-build
target was slightly refactored and updated to support theDockerfile
changes, and building from nonamd64
archs.Follow-up work: