LLM-D: Kubernetes-Native Distributed Inference at Scale

Kemschumam

What would be the benefit of this project over hosting vLLM on Ray?