Aidan Scannell
@aidanscannell.bsky.social
AI/ML/Robotics. Incoming postdoc at University of Edinburgh, previously at Aalto University.
Reposted by Aidan Scannell
Our #ICLR2025 poster "Discrete Codebook World Models for Continuous Control" (Aidan Scannell, Mohammadreza Nakhaeinezhadfard, Kalle Kujanpää, Yi Zhao, Kevin Luck, Arno Solin, Joni Pajarinen)
🗓️ Hall 3 + Hall 2B #415, Thu 24 Apr, 10:00 a.m.–12:30 p.m. (UTC+8)
📄 Preprint: arxiv.org/abs/2503.00653
April 21, 2025 at 3:38 PM
Reposted by Aidan Scannell
Multi-Head Latent Attention vs Grouped-Query Attention: we break down why MLA is a more expressive memory-compression technique AND why naive implementations can backfire. Check it out!
⚡️Multi-Head Latent Attention is one of the key innovations that enabled @deepseek_ai's V3 and the subsequent R1 model.

⏭️ Join us as we continue our series on efficient AI inference, covering both theoretical insights and practical implementation details:

🔗 datacrunch.io/blog/deepsee...
DeepSeek + SGLang: Multi-Head Latent Attention
Multi-Head Latent Attention (MLA) improves upon Grouped-Query Attention (GQA), enabling long-context reasoning models and wider adoption across open-source LLMs.
datacrunch.io
March 12, 2025 at 7:01 PM
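
For context on the cache-compression claim in the post above, here is a minimal back-of-the-envelope sketch of why MLA can shrink the KV cache relative to GQA: GQA still stores full key/value vectors for a reduced number of KV heads, whereas MLA stores a single low-rank latent per token and reconstructs per-head keys and values from it. This is not code from the linked blog post; the head counts, latent dimension, and RoPE dimension below are illustrative assumptions loosely modelled on publicly reported DeepSeek-V3 settings.

```python
# A rough sketch comparing per-token KV-cache footprints of standard
# multi-head attention (MHA), grouped-query attention (GQA), and
# multi-head latent attention (MLA). All configuration values are
# illustrative assumptions, not figures from the linked post.

BYTES_PER_ELEM = 2  # assume fp16/bf16 cache entries


def mha_cache_per_token(n_heads: int, head_dim: int) -> int:
    """MHA caches a full key and value vector for every head."""
    return 2 * n_heads * head_dim * BYTES_PER_ELEM


def gqa_cache_per_token(n_kv_heads: int, head_dim: int) -> int:
    """GQA shares each K/V head across a group of query heads,
    so only n_kv_heads key/value pairs are cached per token."""
    return 2 * n_kv_heads * head_dim * BYTES_PER_ELEM


def mla_cache_per_token(latent_dim: int, rope_dim: int) -> int:
    """MLA caches one shared low-rank latent per token (plus a small
    decoupled RoPE key) and re-expands per-head K/V from it at attention time."""
    return (latent_dim + rope_dim) * BYTES_PER_ELEM


if __name__ == "__main__":
    n_heads, head_dim = 128, 128  # illustrative model width
    print("MHA:", mha_cache_per_token(n_heads, head_dim), "bytes/token")
    print("GQA:", gqa_cache_per_token(8, head_dim), "bytes/token (8 KV heads)")
    print("MLA:", mla_cache_per_token(512, 64), "bytes/token (512-d latent + 64-d RoPE key)")
```

One likely reason "naive implementations can backfire", as the post puts it, is that the per-head keys and values have to be re-expanded from the latent during attention, which adds compute and memory traffic unless the up-projection matrices are folded into the query and output projections.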