Xunzhuo/llm-instance-gateway

Kubernetes LLM Instance Gateway

The LLM Instance Gateway is part of wg-serving. This repo contains the load balancing algorithm, ext-proc code, CRDs, and controllers that support the LLM Instance Gateway.

This Gateway is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.

Status

This project is currently in development.

For quicker testing, our PoC is in the ./examples/ directory.

Contributing

Our community meeting is held weekly on Thursdays at 10 AM PDT; Zoom link here.

We currently use the #wg-serving Slack channel for communication.

Contributions are welcome. Thanks for joining us!

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.
