Lightnews — Scholar-powered news

Sung Kim @sungkim.bsky.social · 6h

Does "Show less like this" work? My "Discovery" is still dominated by political posts.

Reposted by Sung Kim

Kaggle @kaggle.com · 6h

Ready to learn how to build and deploy your own AI agents?

Join the 5-Day AI Agents Intensive Course with Google, happening November 10-14 — a no-cost, hands-on deep dive designed by Google researchers and engineers.

Learn more 👉 rsvp.withgoogle.com/events/googl...

5-Day AI Agents Intensive Course with Google

Join our 5-day AI Agents Intensive Course with Google, November 10–14, to learn how to build, evaluate, and deploy agents.

rsvp.withgoogle.com

3 5

Sung Kim @sungkim.bsky.social · 6h

More high-entropy tokens utilized → Better performance, but less stable training.

Blog: m2po.notion.site/rl-stale-m2po
Paper: Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs? ( arxiv.org/abs/2510.01161 )
GitHub: github.com/Infini-AI-La...

2

Sung Kim @sungkim.bsky.social · 6h

The study finds that stale data can be as informative as on-policy data, unlocking more scalable, asynchronous RL for LLMs.

Why? It reveals the dual nature of high-entropy tokens: while high-entropy tokens are crucial for learning progress, they also introduce instability in the off-policy setting.
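
For readers unfamiliar with the term, a "high-entropy token" is a position where the model's next-token distribution is spread across many candidates. The sketch below only illustrates that entropy calculation on toy logits; it is not the paper's training method, and the function names and the `threshold` value are hypothetical.

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over one token's logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def high_entropy_mask(per_token_logits, threshold):
    """Flag tokens whose predictive entropy exceeds a (hypothetical) threshold."""
    return [token_entropy(l) > threshold for l in per_token_logits]

# Toy example with a 4-word vocabulary: one confident token, one uncertain token.
confident = [10.0, 0.0, 0.0, 0.0]
uncertain = [1.0, 1.0, 1.0, 1.0]
mask = high_entropy_mask([confident, uncertain], threshold=0.5)
```

Here the uniform distribution reaches the maximum entropy for four outcomes, ln 4, so only the second token is flagged.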

1 6

Sung Kim @sungkim.bsky.social · 6h

Paper: arxiv.org/abs/2510.03506
Project: johnlnguyen.com/oneflow

3

Sung Kim @sungkim.bsky.social · 6h

Meta's OneFlow

A non-autoregressive multimodal model that generates text and images concurrently using a single transformer, unifying Edit Flow (text) with Flow Matching (images).

1 1 8

Sung Kim @sungkim.bsky.social · 14h

That’s not going to go well for Oracle or for the rest of the players. Expect a bit of a reckoning.

3

Sung Kim @sungkim.bsky.social · 14h

The funny thing is, the entire GPU market could end up being dragged down by Oracle. Imagine a high-margin enterprise software company (around 70%) stepping into a commodity-like GPU market with maybe 10% margins.

1 1 14

Sung Kim @sungkim.bsky.social · 17h

I’m not sure what to call this, but…

xAI is raising $20B to buy Nvidia GPUs, and Nvidia is chipping in $2B for the round.

5 30

Sung Kim @sungkim.bsky.social · 18h

Vegans love Taco Bell...

Reposted by Sung Kim

Maya @may.as · 1d

I built a new shelf today to hold VHS in my retro media center! and to celebrate I'm watching my official @summoningsalt.bsky.social Halo 2 World Records VHS tape that I got! (the photo is taken on my friend's digicam!)

a photo of a carpeted room with a CRT tv, a game system with controllers on the ground, and a bookshelf partially full of VHS tapes to the side

5 26

Reposted by Sung Kim

Josh Marshall @joshtpm.bsky.social · 1d

Glad Morris is on this issue. Obviously political violence in our society and the openness to it is a big issue. BUT I do think a lot of these polls, intentionally or not, have a way of amplifying very hypothetical questions. Or from a different perspective ... www.gelliottmorris.com/p/most-polls...

Why most polls overstate support for political violence

Misperceptions about the popularity of violence increase public support for it — but you can help change that.

www.gelliottmorris.com

10 64 200

Reposted by Sung Kim

Mark Riedl @markriedl.bsky.social · 19h

Hi all,

Georgia Tech is holding its 2nd Summit on Responsible Computing, AI, and Society October 28-29th in Atlanta.

rcais.github.io

Our speakers and schedule are set. It's going to be great!

If you are planning to be in Atlanta and want to join us, registration is now open on the website.

7 18

Reposted by Sung Kim

thebes @vgel.me · 19h

if you're interested in gaining a better intuition for how llms behave at inference time, you should try logitloom🌱, the open-source tool i made for exploring token trajectory trees (aka looming) on base and instruct models! more info in thread

🌱 vgel.me/logitloom
💻 github.com/vgel/logitloom

5 23 96

Reposted by Sung Kim

Nature Physics @natphys.nature.com · 1d

Want to read a Comment by the three Nobel Laureates on quantum circuits?

Here it is: ($)

www.nature.com/articles/s41...

Quantum Josephson junction circuits and the dawn of artificial atoms - Nature Physics

In 1985, experiments revealed the quantum behaviour of a macroscopic degree of freedom: the phase difference across a Josephson junction. The authors recount the history of this milestone for the development of superconducting quantum circuits.

www.nature.com

1 17 37

Sung Kim @sungkim.bsky.social · 19h

Blog: alexiajm.github.io/2025/09/29/t...
Code: github.com/SamsungSAILM...
Paper: arxiv.org/abs/2510.04871

Less is More: Recursive Reasoning with Tiny Networks

alexiajm.github.io

2

Sung Kim @sungkim.bsky.social · 19h

Less is More: Recursive Reasoning with Tiny Networks
by @alexiajm.bsky.social

Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs.

2 2 18

Sung Kim @sungkim.bsky.social · 19h

I really appreciate Kraig Adams for creating a new genre of silent hiking videos, but one thing that always stands out to me about his content is that he doesn’t seem to enjoy traveling, whereas Harman Hoek seems to love it.

Just an observation...

2

Sung Kim @sungkim.bsky.social · 19h

I work from home, and to keep myself company I watch a lot of YouTube videos in the background, especially those by Kraig Adams and Harman Hoek.

1 6

Sung Kim @sungkim.bsky.social · 1d

Just google 'pharma bro' for his identity.

2

Sung Kim @sungkim.bsky.social · 1d

Forget AI - just get notified when a pharma bro tweets. His tweets move the market.

1 7

Sung Kim @sungkim.bsky.social · 1d

OpenAI proudly declared that Codex had written 80% of the UI for their Agent Builder.

The UI...

8 5 63

Sung Kim @sungkim.bsky.social · 1d

Another survey of Reinforcement Learning

"Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle"

arxiv.org/abs/2509.16679

5

Sung Kim @sungkim.bsky.social · 1d

A bonus: if you combine CCA with Grouped Query Attention (GQA), you can tune compression toward either FLOP or memory limits without sacrificing quality.

Paper: arxiv.org/abs/2510.04476

Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space

Multi-headed Attention's (MHA) quadratic compute and linearly growing KV-cache make long-context transformers expensive to train and serve. Prior works such as Grouped Query Attention (GQA) and Multi-...

arxiv.org

6

Sung Kim @sungkim.bsky.social · 1d

Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space

Compress everything (queries, keys, values) into a smaller shared latent space, which slashes:

• Parameters (fewer weights)
• Cache size (smaller KV cache)
• Compute (FLOPs) (less math to do)
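
A back-of-the-envelope sketch of the cache-size point above: storing keys and values in a narrower latent shrinks the KV cache in proportion to the width. Every number here (layer count, head size, sequence length, the `compression` factor of 8) is a hypothetical assumption, and this is generic arithmetic, not the paper's CCA implementation.

```python
def kv_cache_bytes(seq_len, n_layers, kv_width, bytes_per_value=2):
    """Bytes needed to cache keys and values: 2 tensors x layers x tokens x width."""
    return 2 * n_layers * seq_len * kv_width * bytes_per_value

# Hypothetical model: 32 layers, 32 heads of size 128 (width 4096), fp16 values.
d_model = 32 * 128
full = kv_cache_bytes(seq_len=8192, n_layers=32, kv_width=d_model)

# Assumed compression factor of 8: keys and values live in a width-512 latent.
compression = 8
latent = kv_cache_bytes(seq_len=8192, n_layers=32, kv_width=d_model // compression)

print(f"full cache:   {full / 2**30:.2f} GiB")
print(f"latent cache: {latent / 2**30:.2f} GiB")
```

With these assumed dimensions the full-width cache is 4 GiB per 8192-token sequence and the latent cache is 0.5 GiB, an 8x reduction that matches the assumed compression factor.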

1 5