feat: Add scripts for kubernetes dev env using vLLM and vLLM-p2p #60

kfirtoledo · 2025-04-25T05:21:13Z

Add support for Kubernetes environment development using GIE with KGateway and vLLM
This PR introduces support for the vllm mode, enabling integration testing of GIE with vLLM.
It also adds support for the vllm-p2p mode, which includes:

- Deployment of Redis and LMCache alongside the vLLM image
- Peer-to-peer (P2P) communication between vLLM instances
- Use of the EPP image to enable kv-cache-aware routing

shaneutt

Looking great. Most of my comments are smaller, but I do have some questions for other folks as to what effect this will have.

Also, cc @elevran @shmuelk who I think should take a look.

deploy/components/inference-gateway/inference-models.yaml

deploy/components/vllm-p2p/deployments/redis-deployment.yaml

deploy/components/vllm-p2p/deployments/vllm-deployment.yaml

deploy/components/vllm/deployments.yaml

deploy/components/vllm/kustomization.yaml

deploy/components/vllm-p2p/kustomization.yaml

deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml

deploy/environments/dev/kubernetes-kgateway/kustomization.yaml

shaneutt · 2025-04-25T16:11:59Z

deploy/environments/dev/kubernetes-vllm/vllm-p2p/kustomization.yaml

@@ -0,0 +1,11 @@
+apiVersion: kustomize.config.k8s.io/v1beta1


Oh ok, I see what you're doing with the naming now. The difference is that each of these deployments deploys only a working vLLM stack, and you then have to deploy your inference-gateway stack separately.

cc @tumido @Gregory-Pereira @vMaroon just wanting to check with you on how this will work with your Helm chart?
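For readers following the thread, the two-step flow described in this comment might look roughly like the sketch below. It assumes that both directories named in this PR are kustomize roots that can be applied with `kubectl apply -k`. The `VLLM_ENV`, `GATEWAY_ENV`, and `DRY_RUN` variables and the `run` and `deploy_stacks` helpers are hypothetical names used only for illustration, and `scripts/kubernetes-dev-env.sh` may well do this differently.

```shell
#!/usr/bin/env bash
# Sketch only: apply the vLLM stack and the gateway stack as two separate steps.
set -euo pipefail

# Assumption: these paths (taken from the files touched in this PR) are kustomize roots.
VLLM_ENV="${VLLM_ENV:-deploy/environments/dev/kubernetes-vllm/vllm-p2p}"
GATEWAY_ENV="${GATEWAY_ENV:-deploy/environments/dev/kubernetes-kgateway}"

# Hypothetical switch: DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1 brings up only the vLLM stack; step 2 adds the inference-gateway stack.
deploy_stacks() {
  run kubectl apply -k "$VLLM_ENV"
  run kubectl apply -k "$GATEWAY_ENV"
}

deploy_stacks
```

Setting `DRY_RUN=0` would run the two `kubectl apply -k` commands against whatever cluster the current kubeconfig context points at.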

scripts/kubernetes-dev-env.sh

deploy/components/vllm-p2p/deployments/redis-deployment.yaml

deploy/components/vllm-p2p/deployments/vllm-deployment.yaml

kfirtoledo · 2025-04-26T23:53:59Z

@shaneutt, PTAL.

DEVELOPMENT.md

deploy/components/vllm/kustomization.yaml

deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml

deploy/environments/dev/kubernetes-kgateway/kustomization.yaml

scripts/kubernetes-dev-env.sh

kfirtoledo · 2025-04-28T12:20:05Z

@shaneutt and @elevran PTAL

DEVELOPMENT.md

deploy/components/vllm/deployments.yaml

deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml

shaneutt

Approving to unblock.

Once @elevran is 👍, I'm 👍

elevran · 2025-04-29T13:20:32Z

@kfirtoledo LGTM. Any idea about the CI/CD failure?

kfirtoledo requested review from shaneutt, vMaroon and elevran April 25, 2025 05:21

kfirtoledo added the help wanted and WIP labels Apr 25, 2025

kfirtoledo changed the title ~~feat: add scripts for kubernetes dev env using vLLM and vLLM-p2p~~ feat: Add scripts for kubernetes dev env using vLLM and vLLM-p2p Apr 25, 2025

kfirtoledo force-pushed the dev branch from 832da6e to 1b6ab70 Compare April 25, 2025 12:03

shaneutt requested changes Apr 25, 2025

shaneutt requested a review from shmuelk April 25, 2025 16:15

kfirtoledo force-pushed the dev branch from 1b6ab70 to f95936f Compare April 25, 2025 19:38

vMaroon reviewed Apr 25, 2025

deploy/components/vllm-p2p/deployments/redis-deployment.yaml (outdated, resolved)

kfirtoledo force-pushed the dev branch 2 times, most recently from fc98576 to 1a7fa8e Compare April 26, 2025 00:05

Gregory-Pereira reviewed Apr 26, 2025

deploy/components/vllm-p2p/deployments/vllm-deployment.yaml (outdated, resolved)

kfirtoledo force-pushed the dev branch 2 times, most recently from 390c50a to b189362 Compare April 26, 2025 23:52

kfirtoledo removed the help wanted and WIP labels Apr 27, 2025

elevran reviewed Apr 27, 2025

kfirtoledo force-pushed the dev branch from b189362 to f5bc959 Compare April 27, 2025 13:47

kfirtoledo linked an issue Apr 28, 2025 that may be closed by this pull request

OpenShift Dev Environment - Full Gateway+GIE Stack Deployment with VLLM and VLLM-P2P mode #72

Closed

shaneutt self-requested a review April 28, 2025 12:50

shaneutt reviewed Apr 28, 2025

DEVELOPMENT.md (outdated, resolved)

deploy/components/vllm/deployments.yaml (resolved)

deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml (outdated, resolved)

shaneutt approved these changes Apr 29, 2025

elevran approved these changes Apr 29, 2025

kfirtoledo added 2 commits April 29, 2025 16:50

feat: add scripts for kubernetes dev env using vLLM and vLLM-p2p (setup for kvcache-aware)

bf8017f

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

[fix]: Small fixes for development YAMLs

78157d5

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

kfirtoledo added 3 commits April 29, 2025 16:50

[fix]: Small fixes for deployment and fix comments

a11e984

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

[fix]: fix typos and edit the Readme and env vars

937bb50

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

[fix] Fix the kind environment and set gateway service to be NodePort

17a23e5

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

kfirtoledo force-pushed the dev branch from 1db191f to 17a23e5 Compare April 29, 2025 13:51

kfirtoledo merged commit f67cc34 into neuralmagic:dev Apr 29, 2025
1 check failed
