Baolong Mao (Tencent), Chunxiao Zheng (Tencent), Weishu Deng (Tensormesh), Darren Peng (Tensormesh), Samuel Shen (Tensormesh) What is P2P and what does it promise? In this blog post, we will go over: a…
In this clip, our CEO and co-founder, Junchen Jiang, explains what it really takes to build a company at the intersection of academia, open source, and industry.
🎥 Watch the full interview :
👉 y2u.be/zHW4Zzd7pjI
#AIInfrastructure #KVCache #Tensormesh #LLMs
In this clip, our CEO and co-founder, Junchen Jiang, reflects on the moment it clicked that KV caching wasn't just an optimization, but a foundational shift in how LLM inference should work.
🎥 Watch the full interview on YouTube:
👉 y2u.be/zHW4Zzd7pjI #KVCache
🎥 Watch the full interview: youtu.be/zHW4Zzd7pjI
#LLMInference #KVCache #OpenSource #PyTorch
Shopping for someone who complains about inference costs?
Give them $100 in Tensormesh credits:
✨ 5-10x lower costs
✨ Sub-second latency
✨ Seamless integration
🎁 Claim here: app.tensormesh.ai
Happy holidays! 🚀
The $255B market has an energy crisis. But there's a solution most companies are missing.
✅ How intelligent caching cuts GPU costs 5-10× ↓
www.tensormesh.ai/blog-posts/a...
Authors: Yihua Cheng, Yuhan Liu, Jiayi Yao*, Yuwei An, Xiaokun Chen, Shaoting Feng, Yuyang Huang, Samuel Shen, Kuntai Du, Junchen Jiang
Affiliation: TensorMesh & University of Chicago
Abstract: Today's large language model (LLM) inference systems treat individual inference engines and requests independently for simplicity of design, which leads to serious resource inefficiency. Prior work has proposed reusing the KV cache across requests to avoid redundant computation, and splitting a single request across inference engines to improve GPU…
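The abstract's first idea, reusing the KV cache across requests that share a common prefix, can be sketched as a longest-prefix lookup. The sketch below is purely illustrative: real systems such as LMCache key on hashed token chunks and store GPU tensors, whereas the `PrefixKVCache` class, the `prefill` helper, and the string "KV entries" here are toy stand-ins.

```python
# Toy sketch of cross-request KV-cache reuse via longest-prefix matching.
# Real caches hash fixed-size token chunks; this version matches exact
# token-prefix tuples, which is enough to show the idea.

class PrefixKVCache:
    def __init__(self):
        self._store = {}  # token-tuple prefix -> cached "KV" payload

    def put(self, tokens, kv):
        self._store[tuple(tokens)] = kv

    def longest_prefix(self, tokens):
        """Return (matched_len, kv) for the longest cached prefix of tokens."""
        for end in range(len(tokens), 0, -1):
            kv = self._store.get(tuple(tokens[:end]))
            if kv is not None:
                return end, kv
        return 0, None


def prefill(tokens, cache):
    """Compute KV only for the suffix not covered by the cache."""
    hit_len, kv = cache.longest_prefix(tokens)
    new_kv = [f"kv({t})" for t in tokens[hit_len:]]  # stand-in for attention math
    full_kv = (list(kv) if kv else []) + new_kv
    cache.put(tokens, full_kv)
    return hit_len, full_kv


cache = PrefixKVCache()
prefill([1, 2, 3, 4], cache)       # cold request: computes KV for all 4 tokens
prefill([1, 2, 3, 4, 5], cache)    # shared prefix: reuses 4, computes only 1
```

A second request that extends the first one's prompt only pays for its new tokens, which is exactly the redundancy the abstract says per-request-independent systems leave on the table.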
Author: Junchen Jiang
Announcing Tensormesh
First, I want to repeat here the news I posted last week in the LMCache #general Slack channel: "I am delighted to announce that the founding team behind LMCache decided, a few months ago, to start a company called Tensormesh. With the beta release of our first product, we are making Tensormesh official! Our namesake product, TensorMesh, is a SaaS front end that lets you launch any open-weight model on GPUs from the hardware vendors we support, while, for LMCache and…
Author: Kuntai Du
TL;DR: 🚀 LMCache Lab uses speculative decoding to cut decoding latency by 60% on code/text-editing tasks! ⚡
---
You may know LMCache Lab for its KV cache optimizations, which make LLM prefilling a breeze. But that's not all! We are now also focused on accelerating the decoding stage, so your LLM agents can generate new content even faster. In other words: for the same workload, you can rent fewer machines and save on LLM…
Authors: Yihua, Kobe
LMCache supports OpenAI's newly released GPT-OSS models (20B and 120B parameters) from day one!
This post is a complete guide to deploying the GPT-OSS models with vLLM + LMCache and getting a significant performance boost from CPU offloading.
## Step 1: Install the GPT-OSS build of vLLM
### Installation
```bash
uv pip install --pre vllm==0.10.1+gptoss \
  …
```
Best practices, New features
Tags: AIbrix, dynamo, inference engines, kserve, kubernetes, llm-d, LMIgnite, Modular, orchestration, orchestrator, production stack, scale, SGLang, OME
Authors: Junchen Jiang, Hanchen Li and Jake Sonsini…
Not because models are getting worse
But because your user base is growing and you're treating every request like it's brand new
Here's the infrastructure mistake 90% of AI startups are making:
That's golden for our community and everyone at
@tensormesh
#kubecon #cncf #AI #LLM #inference
What's been your biggest surprise about scaling costs?
For us it was realizing how much we were recomputing identical work.
Curious what others have hit?
#AIEngineering #MLOps
Announcing Tensormesh First I wanted to repeat here what I posted on the LMCache #general Slack channel last week: I am delighted to…
https://blog.lmcache.ai/en/2025/10/31/tensormesh-unveiled-and-lmcache-joins-the-pytorch-foundation/
tensormesh.ai/blog-posts/t...
#llm #ai #kvcache #lmcache #vllm #benchmarking