feat: Add scripts for kubernetes dev env using vLLM and vLLM-p2p #60
deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
Oh ok, I see what you're doing with the naming now. The difference now is that each of these deployments deploys only a working vLLM stack, and you then have to deploy your inference-gateway stack separately.
cc @tumido @Gregory-Pereira @vMaroon just wanting to check with you on how this will work with your Helm chart?
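For illustration only, the split described in the comment above (a vLLM-only stack, with the inference gateway deployed separately) might look like the following kustomization sketch. The `apiVersion` matches the one in the diff hunk; the resource file names are hypothetical placeholders and are not taken from this PR.

```yaml
# Hypothetical kustomization.yaml for a vLLM-only environment.
# The resource file names below are assumptions, not files from this PR.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml   # vLLM model server Deployment (assumed name)
  - service.yaml      # Service exposing the vLLM pods (assumed name)
```

Under this assumption, the gateway environment would carry its own kustomization and be applied as a separate step.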
fc98576 to 1a7fa8e
390c50a to b189362
@shaneutt, PTAL.
deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml
deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml
Approving to unblock.
Once @elevran is 👍, I'm 👍
@kfirtoledo LGTM, any idea on the CI/CD failure?
…tup for kvcache-aware) Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Add support for Kubernetes environment development using GIE with KGateway and vLLM
This PR introduces support for the `vllm` mode, enabling integration testing of GIE with vLLM. It also adds support for the `vllm-p2p` mode, which includes: