Dynamic lora load/unload sidecar #31

Merged
k8s-ci-robot merged 34 commits into kubernetes-sigs:main on Nov 18, 2024

Conversation

coolkp
Contributor

@coolkp coolkp commented Oct 23, 2024

Adds a sidecar example for dynamically managing LoRA adapters on a vLLM server.
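
As a rough illustration of what "dynamically managing" adapters could involve, the sketch below compares a desired adapter list (for example, one read from a mounted ConfigMap) with the set the server currently reports as loaded, and works out what to load and unload. The `Adapter` record, its fields, and `plan_reconcile` are hypothetical names chosen for this example; they are not taken from the PR, and the HTTP calls that would actually load or unload adapters on the model server are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adapter:
    """Hypothetical adapter record: an id plus where its weights live."""
    id: str
    source: str


def plan_reconcile(desired, loaded_ids):
    """Return (to_load, to_unload) so the server ends up serving `desired`.

    `desired` is a list of Adapter records; `loaded_ids` is the set of
    adapter ids the server currently reports as loaded.
    """
    desired_ids = {a.id for a in desired}
    # Adapters listed in the config but not yet on the server.
    to_load = [a for a in desired if a.id not in loaded_ids]
    # Adapters on the server that the config no longer lists.
    to_unload = sorted(i for i in loaded_ids if i not in desired_ids)
    return to_load, to_unload


if __name__ == "__main__":
    desired = [
        Adapter("sql-lora", "/adapters/sql"),
        Adapter("chat-lora", "/adapters/chat"),
    ]
    loaded = {"chat-lora", "old-lora"}
    to_load, to_unload = plan_reconcile(desired, loaded)
    print([a.id for a in to_load], to_unload)
```

Keeping the planning step as a pure function makes it easy to unit test without a running server; a real sidecar would presumably run it each time the configuration changes and then issue the corresponding requests.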

linux-foundation-easycla bot commented Oct 23, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 23, 2024
@liu-cong
Contributor

/assign

Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 30, 2024
Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 30, 2024
@coolkp
Contributor Author

coolkp commented Oct 30, 2024

Thanks @liu-cong and @guydc for reviewing, and deep apologies for the delay in responding. My notifications were misconfigured, so the emails went to the wrong place :(

Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
…et changes, pull dynamically from configmap

Signed-off-by: Kunjan Patel <kunjanp@google.com>
@coolkp coolkp requested a review from ahg-g November 9, 2024 02:00
Contributor

@ahg-g ahg-g left a comment

why are we placing this under examples?

Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
@coolkp
Contributor Author

coolkp commented Nov 11, 2024

> why are we placing this under examples?

Where do you suggest we place it? @ahg-g

Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan Patel <kunjanp@google.com>
@ahg-g
Contributor

ahg-g commented Nov 12, 2024

> > why are we placing this under examples?
>
> Where do you suggest we place it? @ahg-g

perhaps under a tools directory?

Signed-off-by: Kunjan <kunjanp@google.com>
Signed-off-by: Kunjan <kunjanp@google.com>
@coolkp coolkp requested review from liu-cong and ahg-g November 12, 2024 18:46
coolkp and others added 2 commits November 16, 2024 11:49
Co-authored-by: Abdullah Gharaibeh <40361897+ahg-g@users.noreply.github.com>
Co-authored-by: Abdullah Gharaibeh <40361897+ahg-g@users.noreply.github.com>
@coolkp coolkp requested a review from ahg-g November 16, 2024 19:49
@ahg-g
Contributor

ahg-g commented Nov 18, 2024

/lgtm
/approve

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 18, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 18, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, coolkp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 18, 2024
@k8s-ci-robot k8s-ci-robot merged commit 54ee6d7 into kubernetes-sigs:main Nov 18, 2024
2 checks passed
shaneutt pushed a commit to shaneutt/gateway-api-inference-extension that referenced this pull request Apr 18, 2025
kfswain pushed a commit to kfswain/llm-instance-gateway that referenced this pull request Apr 29, 2025
* Dynamic lora load/unload sidecar

* Formatting

* Resolve README comments

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Address comments on sidecar, store updates in memory, rename base field

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Address comments in example deployment

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Address comments in example deployment

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* base model is optional

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Check health of server before querying

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Check health of server before querying

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Docstrings

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Mock health check in tests

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Refactor configmap, switch to watchfiles to detect symbolic link target changes, pull dynamically from configmap

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Refactor configmap, switch to watchfiles to detect symbolic link target changes, pull dynamically from configmap

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Modify unittests

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Change example host and port to be explicit

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Change example sidecar name

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add warning about using subPath

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add screenshots

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add screenshots

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add testing results

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add testing results

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add config validation

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add config documentation

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add config documentation

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add config validation

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Add config validation

Signed-off-by: Kunjan Patel <kunjanp@google.com>

* Make reconciling non blocking

* Move under tools

Signed-off-by: Kunjan <kunjanp@google.com>

* Move under tools

Signed-off-by: Kunjan <kunjanp@google.com>

* Document usage of sidecar, available by default from 1.29

* Document usage of sidecar, available by default from 1.29

* Document usage of sidecar, available by default from 1.29

Signed-off-by: Kunjan <kunjanp@google.com>

* Update tools/dynamic-lora-sidecar/README.md

Co-authored-by: Abdullah Gharaibeh <40361897+ahg-g@users.noreply.github.com>

* Update tools/dynamic-lora-sidecar/README.md

Co-authored-by: Abdullah Gharaibeh <40361897+ahg-g@users.noreply.github.com>

---------

Signed-off-by: Kunjan Patel <kunjanp@google.com>
Signed-off-by: Kunjan <kunjanp@google.com>
Co-authored-by: Abdullah Gharaibeh <40361897+ahg-g@users.noreply.github.com>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

5 participants