Sung Kim
sungkim.bsky.social
A business analyst at heart who enjoys delving into AI, ML, data engineering, data science, data analytics, and modeling. My views are my own.

You can also find me on Threads: @sung.kim.mw
Nvidia released TiDAR, a hybrid architecture that combines diffusion-based parallel drafting with autoregressive sampling into a single model forward pass.

Paper: www.arxiv.org/abs/2511.089...
Model: ???
November 26, 2025 at 2:52 PM
System Instructions for Gemini 3 Pro

Source: ai.google.dev/gemini-api/d...
November 26, 2025 at 2:49 PM
Hands-On Machine Learning with Scikit-Learn and PyTorch added an online appendix on State-Space Models (SSMs)

ageron.github.io/homlp/HOMLP_...
November 26, 2025 at 2:47 PM
"Talagrand's convolution conjecture up to loglog via perturbed reverse heat" by Yuansi Chen

He proves that under the heat semigroup (P_τ) on the Boolean hypercube, any nonnegative function f: {−1,1}^n → ℝ_+ exhibits a uniform tail bound that is better than the one given by Markov's inequality.
November 26, 2025 at 6:18 AM
I don’t know why people say there’s a global HBM Chips shortage. Any Korean can just walk into a local 7-Eleven and buy HBM Chips for about a dollar.

www.chosun.com/english/mark...
7-Eleven, SK Hynix Debut HBM Chips Snack
Convenience store and semiconductor giant collaborate on HBM-inspired honey-banana snack with promotional stickers and prizes
www.chosun.com
November 26, 2025 at 5:57 AM
I support Senator Mark Kelly’s fight against Trump and Hegseth, but does he really need to text me asking for a donation?
November 26, 2025 at 2:12 AM
Understanding the Limitations of Diffusion LLMs through a Probabilistic Perspective by Cunxiao Du, Xinyu Yang, Min Lin, Chao Du and the team

It tries to answer this question: as probabilistic models for language data, how do Diffusion LLMs differ from AR LLMs when fitting natural language?
Understanding the Limitations of Diffusion LLMs through a Probabilistic Perspective | Notion
Author: Cunxiao Du, Xinyu Yang, Min Lin, Chao Du and the team
www.notion.so
November 26, 2025 at 1:53 AM
If the rumor is true that Google may sell TPUs, not just offer them through Google Cloud, then it wouldn’t affect NVIDIA much in the near to medium term, but it would disrupt other GPU vendors (like AMD) and it would decimate many ASIC vendors (such as Cerebras, Groq, Tenstorrent, etc.).
I just need some clarification on these TPU rumors.

Are they implying that Google will sell TPUs, not just offer them through Google Cloud, to other companies like Meta?
November 26, 2025 at 1:23 AM
Continuous batching is the secret to why vLLM and transformers are fast.

"Continuous batching" by Remi Ouazan and two others.

huggingface.co/blog/continu...
November 26, 2025 at 12:52 AM
I’m confused about why he didn’t get the job. He answered the question correctly.
November 26, 2025 at 12:26 AM
Also, do you belong to two gyms so you don’t see the same people all the time? Or is it just me?
Exercising regularly has both benefits and drawbacks.

- Now I can easily bench two plates, squat three plates, and even do Nordic curls. Yea…
- But some of my most technical and "expensive" jackets fit me very, very tight.
November 25, 2025 at 7:14 AM
Exercising regularly has both benefits and drawbacks.

- Now I can easily bench two plates, squat three plates, and even do Nordic curls. Yea…
- But some of my most technical and "expensive" jackets fit me very, very tight.
November 25, 2025 at 7:09 AM
Salesforce's xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning

An intelligent LLM routing system trained with reinforcement learning to dynamically select optimal models from 20+ available LLMs while optimizing for both performance and cost.
November 25, 2025 at 6:32 AM
Tencent's HunyuanOCR

An expert, end-to-end OCR model built on Hunyuan's native multimodal architecture and training strategy. This model is "supposed to" achieve SOTA performance with only 1 billion parameters, significantly reducing deployment costs.
November 25, 2025 at 6:29 AM
For more than a decade, Mountain Hardwear’s Ghost Whisperer has been the gold standard in lightweight down jackets and hoodies, but I think they’re really ugly and I own the jacket, the hoodie, and even the pants.

The pants are next-level ugly.
November 25, 2025 at 5:00 AM
I just need some clarification on these TPU rumors.

Are they implying that Google will sell TPUs, not just offer them through Google Cloud, to other companies like Meta?
November 25, 2025 at 2:03 AM
A port of GPU Puzzles, originally written in Python ( github.com/srush/GPU-Pu... ), to C++, by alexine.

github.com/jalexine/gpu...
November 25, 2025 at 1:01 AM
Is a pixel-level autoregressive model the one model to rule vision?

Google DeepMind suggests that pixel-by-pixel autoregressive modeling may scale into a truly unified vision paradigm. Their study shows that as resolution increases, model size must grow much faster than the dataset,
November 25, 2025 at 12:31 AM
Microsoft's Fara-7B (Open-weight)

Their first agentic small language model for computer use. This experimental model includes robust safety measures to aid responsible deployment.

Blog: www.microsoft.com/en-us/resear...
Model: huggingface.co/microsoft/Fa...
Fara-7B: An efficient agentic small language model for computer use
Fara-7B is our first agentic small language model for computer use. This experimental model includes robust safety measures to aid responsible deployment. Despite its size, Fara-7B holds its own again...
www.microsoft.com
November 25, 2025 at 12:16 AM
Flattening ASTs (and Other Compiler Data Structures) by Adrian Sampson

www.cs.cornell.edu/~asampson/bl...
November 24, 2025 at 11:49 PM
The Korean won's decline is really confusing at a high level, because the export economy is booming thanks to AI, yet the won keeps depreciating…

Why? Because Korean retail investors keep buying U.S. equities. If I were them, I’d buy more SK Hynix, not U.S. stocks, but who else would buy meme stocks?
November 24, 2025 at 11:12 PM
New podcast from Dwarkesh Patel, dropping tomorrow.
November 24, 2025 at 10:47 PM
EGGROLL (Evolution Guided General Optimization via Low-rank Learning)

This project demonstrates integer-only training of a language model directly on the CPU, completely bypassing the need for GPUs, floating-point arithmetic, or heavy ML frameworks like PyTorch or JAX.
November 24, 2025 at 10:40 PM
Social media (well, X/Twitter) after they turned on the light.
November 24, 2025 at 8:24 AM
An interesting observation about Korean couples in Korea.

In the subway, when only one seat becomes available for a couple, more often than not the man sits down while the woman remains standing. I observed this across all generations.
November 24, 2025 at 12:56 AM