Learn more at: https://llm-d.ai
Huge shoutout to @vllm_project and @IBMResearch on the new KV Offloading Connector. We’re seeing up to 9x throughput gains on H100s and massive TTFT reductions. 🧵
blog.vllm.ai/2026/01/08/k...
Check out this breakdown by Cedric Clyburn from Red Hat on how llm-d intelligently routes distributed LLM requests.
🔹 Solves "round-robin" congestion
🔹 Disaggregates prefill/decode (P/D) to save costs
www.youtube.com/watch?v=CNKG...
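For readers curious how cache-aware scoring can beat plain round-robin, here's a minimal Python sketch of the idea. This is not llm-d's actual implementation; the endpoint fields, the prefix-hit heuristic, and the tie-breaking rule are all illustrative assumptions.

```python
# Illustrative only: a toy cache-aware picker vs. plain round-robin.
# Not llm-d's scorer; fields and weights are assumptions for the sketch.
from dataclasses import dataclass, field
from itertools import cycle

@dataclass
class Endpoint:
    name: str
    queue_depth: int = 0                                      # in-flight requests (load signal)
    cached_prefixes: set[str] = field(default_factory=set)    # prompt prefixes held in KV cache

def round_robin(endpoints):
    """Plain round-robin: ignores both load and cache locality."""
    return cycle(endpoints)

def cache_aware_pick(endpoints, prompt_prefix: str) -> Endpoint:
    """Prefer endpoints that already hold the prompt's KV cache; break ties by load."""
    def score(ep: Endpoint) -> tuple[int, int]:
        hit = 1 if prompt_prefix in ep.cached_prefixes else 0
        return (hit, -ep.queue_depth)                          # cache hit first, then least loaded
    return max(endpoints, key=score)

if __name__ == "__main__":
    eps = [
        Endpoint("pod-a", queue_depth=5, cached_prefixes={"system-prompt-v1"}),
        Endpoint("pod-b", queue_depth=1),
        Endpoint("pod-c", queue_depth=9),
    ]
    # Round-robin sends this request wherever the rotation happens to land;
    # cache-aware scoring routes it to pod-a and reuses the existing prefill work.
    print(cache_aware_pick(eps, "system-prompt-v1").name)      # -> pod-a
```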