Orchestrating Kubernetes AI Inference Workloads with NVIDIA Grove — From DRA GA to KAI Scheduler Integration
dev.to
Why Existing Kubernetes Alone Falls Short for AI Inference Workloads

In March 2026, at KubeCon Europe 2026 in Amsterdam, NVIDIA officially announced the open-source project Grove. Grove is a Kubernetes API for declaratively defining and orchestrating complex AI inference systems in Kubernetes. The era of simply spinning up a single Pod is over. Modern LLM inference architectures are evolving toward Disaggregated Inference patterns. The Prefill and Decode stages are separated, KV-Cach