Lightnews — Scholar-powered news

Sung Kim

@sungkim.bsky.social

Nvidia released TiDAR, a hybrid architecture that combines diffusion-based parallel drafting with autoregressive sampling into a single model forward pass.

Paper: www.arxiv.org/abs/2511.089...
Model: ???

November 26, 2025 at 2:52 PM

Sung Kim

@sungkim.bsky.social

System Instructions for Gemini 3 Pro

Source: ai.google.dev/gemini-api/d...

November 26, 2025 at 2:49 PM

Sung Kim

@sungkim.bsky.social

Hands-On Machine Learning with Scikit-Learn and PyTorch added an online appendix on State-Space Models (SSMs)

ageron.github.io/homlp/HOMLP_...

November 26, 2025 at 2:47 PM

Sung Kim

@sungkim.bsky.social

"Talagrand's convolution conjecture up to loglog via perturbed reverse heat" by Yuansi Chen

He proves that under the heat semigroup (P_τ) on the Boolean hypercube, any nonnegative function f: {−1,1}^n → ℝ_+ exhibits a uniform tail bound that is better than the one given by Markov's inequality.

November 26, 2025 at 6:18 AM

Sung Kim

@sungkim.bsky.social

I don’t know why people say there’s a global HBM Chips shortage. Any Korean can just walk into a local 7-Eleven and buy HBM Chips for about a dollar.

www.chosun.com/english/mark...

7-Eleven, SK Hynix Debut HBM Chips Snack

Convenience store and semiconductor giant collaborate on HBM-inspired honey-banana snack with promotional stickers and prizes

www.chosun.com

November 26, 2025 at 5:57 AM

Sung Kim

@sungkim.bsky.social

I support Senator Mark Kelly’s fight against Trump and Hegseth, but does he really need to text me asking for a donation?

November 26, 2025 at 2:12 AM

Sung Kim

@sungkim.bsky.social

Understanding the Limitations of Diffusion LLMs through a Probabilistic Perspective by Cunxiao Du, Xinyu Yang, Min Lin, Chao Du and the team

It tries to answer this question: as probabilistic models for language data, how do diffusion LLMs differ from AR LLMs when fitting natural language?

Understanding the Limitations of Diffusion LLMs through a Probabilistic Perspective | Notion

Author: Cunxiao Du, Xinyu Yang, Min Lin, Chao Du and the team

www.notion.so

November 26, 2025 at 1:53 AM

Sung Kim

@sungkim.bsky.social

If the rumor is true that Google may sell TPUs, not just offer them through Google Cloud, then it wouldn’t affect NVIDIA much in the near to medium term, but it would disrupt other GPU vendors (like AMD) and it would decimate many ASIC vendors (such as Cerebras, Groq, Tenstorrent, etc.).

Sung Kim @sungkim.bsky.social · 1d

I just need some clarification on these TPU rumors.

Are they implying that Google will sell TPUs, not just offer them through Google Cloud, to other companies like Meta?

November 26, 2025 at 1:23 AM

Sung Kim

@sungkim.bsky.social

Continuous batching is the secret to why vLLM and transformers are fast.

"Continuous batching" by Remi Ouazan and two others.

huggingface.co/blog/continu...

November 26, 2025 at 12:52 AM

Sung Kim

@sungkim.bsky.social

I’m confused about why he didn’t get the job. He answered the question correctly.

November 26, 2025 at 12:26 AM

Sung Kim

@sungkim.bsky.social

Also, do you belong to two gyms so you don’t see the same people all the time? Or is it just me?

Sung Kim @sungkim.bsky.social · 1d

Exercising regularly has both benefits and drawbacks.

- Now I can easily bench two plates, squat three plates, and even do Nordic curls. Yea…
- But some of my most technical and "expensive" jackets fit me very, very tight.

November 25, 2025 at 7:14 AM

Sung Kim

@sungkim.bsky.social

Exercising regularly has both benefits and drawbacks.

- Now I can easily bench two plates, squat three plates, and even do Nordic curls. Yea…
- But some of my most technical and "expensive" jackets fit me very, very tight.

November 25, 2025 at 7:09 AM

Sung Kim

@sungkim.bsky.social

Salesforce's xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning

An intelligent LLM routing system trained with reinforcement learning to dynamically select optimal models from 20+ available LLMs while optimizing for both performance and cost.

November 25, 2025 at 6:32 AM

Sung Kim

@sungkim.bsky.social

Tencent's HunyuanOCR

An expert, end-to-end OCR model built on Hunyuan's native multimodal architecture and training strategy. This model is "supposed to" achieve SOTA performance with only 1 billion parameters, significantly reducing deployment costs.

November 25, 2025 at 6:29 AM

Sung Kim

@sungkim.bsky.social

For more than a decade, Mountain Hardwear’s Ghost Whisperer has been the gold standard in lightweight down jackets and hoodies, but I think they’re really ugly and I own the jacket, the hoodie, and even the pants.

The pants are next-level ugly.

November 25, 2025 at 5:00 AM

Sung Kim

@sungkim.bsky.social

I just need some clarification on these TPU rumors.

Are they implying that Google will sell TPUs, not just offer them through Google Cloud, to other companies like Meta?

November 25, 2025 at 2:03 AM

Sung Kim

@sungkim.bsky.social

A C++ port of GPU Puzzles, originally written in Python ( github.com/srush/GPU-Pu... ), by alexine.

github.com/jalexine/gpu...

November 25, 2025 at 1:01 AM

Sung Kim

@sungkim.bsky.social

Is a pixel-level autoregressive model the one model to rule vision?

Google DeepMind suggests that pixel-by-pixel autoregressive modeling may scale into a truly unified vision paradigm. Their study shows that as resolution increases, model size must grow much faster than the dataset,

November 25, 2025 at 12:31 AM

Sung Kim

@sungkim.bsky.social

Microsoft's Fara-7B (Open-weight)

Their first agentic small language model for computer use. This experimental model includes robust safety measures to aid responsible deployment.

Blog: www.microsoft.com/en-us/resear...
Model: huggingface.co/microsoft/Fa...

Fara-7B: An efficient agentic small language model for computer use

Fara-7B is our first agentic small language model for computer use. This experimental model includes robust safety measures to aid responsible deployment. Despite its size, Fara-7B holds its own again...

www.microsoft.com

November 25, 2025 at 12:16 AM

Sung Kim

@sungkim.bsky.social

Flattening ASTs (and Other Compiler Data Structures) by Adrian Sampson

www.cs.cornell.edu/~asampson/bl...

November 24, 2025 at 11:49 PM

Sung Kim

@sungkim.bsky.social

The Korean won going down is really confusing at a high level, because the export economy is booming thanks to AI, yet the won keeps depreciating…

Why? Because Korean retail investors keep buying U.S. equities. If I were them, I’d buy more SK Hynix, not U.S. stocks, but who else would buy meme stocks?

November 24, 2025 at 11:12 PM

Sung Kim

@sungkim.bsky.social

New podcast from Dwarkesh Patel, dropping tomorrow.

November 24, 2025 at 10:47 PM

Sung Kim

@sungkim.bsky.social

EGGROLL (Evolution Guided General Optimization via Low-rank Learning)

This project demonstrates integer-only training of a language model directly on the CPU, completely bypassing the need for GPUs, floating-point arithmetic, or heavy ML frameworks like PyTorch or JAX.

November 24, 2025 at 10:40 PM

Sung Kim

@sungkim.bsky.social

Social media (well, X/Twitter) after they turned on the light.

November 24, 2025 at 8:24 AM

Sung Kim

@sungkim.bsky.social

An interesting observation about Korean couples in Korea.

In the subway, when only one seat becomes available for a couple, more often than not the man sits down while the woman remains standing. I observed this across all generations.

November 24, 2025 at 12:56 AM
