Jia-Bin Huang
@jbhuang0604.bsky.social
2.7K followers 32 following 180 posts
Associate Professor at UMD CS. YouTube: https://youtube.com/@jbhuang0604 Interested in how computers can learn and see.
jbhuang0604.bsky.social
How AI Taught Itself to See

Self-supervised learning is fascinating! How can AI learn from images alone, without labels?

In this video, we’ll build the method from first principles and uncover the key ideas behind CLIP, MAE, SimCLR, and DINO (v1–v3).

Video link: youtu.be/oGTasd3cliM
How AI Taught Itself to See [DINOv3]
YouTube video by Jia-Bin Huang
youtu.be
jbhuang0604.bsky.social
New video!

A quick dive into the recent Hierarchical Reasoning Model (HRM) through the lens of algorithm synthesis.

Check it out: youtu.be/RK7lysjz_G0
The Weirdly Small AI That Cracks Reasoning Puzzles [HRM]
YouTube video by Jia-Bin Huang
youtu.be
jbhuang0604.bsky.social
Diffusion LLMs are a promising way to overcome the limitations of autoregressive LLMs.

Less error propagation, easier to control, and faster to sample!

But how do Diffusion LLMs actually work? 🤔

In this video, let's explore some ideas on this fascinating topic! youtu.be/8BTOoc0yDVA
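(A rough sketch of one common flavor, masked/discrete diffusion, just to give a feel for the sampling loop before you watch: start fully masked, predict every position in parallel, keep only the most confident predictions, re-mask the rest, and repeat. The toy_predictor below is a hypothetical stand-in for a trained denoiser, not a real model.)

import numpy as np

MASK = -1      # hypothetical id for the [MASK] token
VOCAB = 50     # toy vocabulary size

def toy_predictor(tokens):
    # Stand-in for a trained denoiser: a real model would be a
    # bidirectional transformer predicting every masked position
    # conditioned on the tokens that are already filled in.
    rng = np.random.default_rng(len(tokens))
    logits = rng.normal(size=(len(tokens), VOCAB))
    return np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

def sample(seq_len=16, num_steps=4):
    tokens = np.full(seq_len, MASK)
    for step in range(num_steps):
        probs = toy_predictor(tokens)
        preds = probs.argmax(axis=1)   # most likely token per position
        conf = probs.max(axis=1)       # model confidence per position
        masked = tokens == MASK
        # Unmask a fraction of the still-masked positions, most confident first.
        k = int(np.ceil(masked.sum() / (num_steps - step)))
        order = np.argsort(-(conf * masked))
        tokens[order[:k]] = preds[order[:k]]
    return tokens

print(sample())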
jbhuang0604.bsky.social
In an era of billion-parameter models everywhere, it's incredibly refreshing to see how a fundamental question can be formulated and solved with simple, beautiful math.

- How should we orient a solar panel ☀️🔋? -

Zero AI! If you enjoy math, you'll love this!

Video: www.youtube.com/watch?v=ZKzL...
jbhuang0604.bsky.social
*Slides without slide titles*

When I first tried presenting WITHOUT slide titles, everything flowed so much better! (Totally validated... by me!)

Give it a shot! Once you try it, you’ll never want to go back.
jbhuang0604.bsky.social
*Empty initial slides*

What’s a better starting point than that default slide layout?

A completely blank slide.

It helps you explore the design space and focus on delivering a clear, compelling story.
jbhuang0604.bsky.social
*Bullet points*

The second thing the layout prompts you to do?
("Click to add text").

Start a bullet list.

Among the many creative ways to present your ideas, it nudges you toward the most boring one: a list. 🔢
jbhuang0604.bsky.social
*Slide title*

The first thing this layout does is to ask you to add a slide title.

Seems reasonable, right? But this encourages you to
1) lead your presentation with text instead of visuals, and
2) cram many titles into a talk, making it harder to maintain a narrative flow.
jbhuang0604.bsky.social
Why is the "Title and Content" slide layout BAD?

Most people prepare their presentation from this default layout. I used it for years without questioning it.

BUT, this layout essentially guides you toward developing a poor presentation. Why? 🤔
jbhuang0604.bsky.social
Thanks! Yup, I hope to cover some fun computer vision applications. Stay tuned!
jbhuang0604.bsky.social
Kids’ summer camp just kicked off, and that means...
I finally have time to make new videos!

What topics are you most interested in right now?
jbhuang0604.bsky.social
Why More Researchers Should Be Content Creators

Just trying something new! I recorded one of my recent talks, sharing what I learned from starting as a small content creator.

youtu.be/0W_7tJtGcMI

We all benefit when there are more content creators!
Reposted by Jia-Bin Huang
csprofkgd.bsky.social
Fresh out of the oven! 🍞 @jbhuang0604.bsky.social breaks down Mean Flow from Kaiming’s group in his latest video.

Video: youtu.be/swKdn-qT47Q?...
jbhuang0604.bsky.social
Policy gradient methods rock!

These are the core techniques behind a transformer that "chats" and "reasons", a robot that manipulates objects, and a drone that maneuvers in a complex environment.

BUT, how do we catch up on all the developments from the past 30+ years?
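(For a quick taste of the starting point, here's a minimal sketch, entirely my own toy example, of the classic REINFORCE / score-function estimator on a 3-armed bandit with a softmax policy.)

import numpy as np

rng = np.random.default_rng(0)
true_rewards = np.array([1.0, 2.0, 3.0])  # toy bandit: arm 2 is best
theta = np.zeros(3)                       # policy parameters (softmax logits)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

lr = 0.1
for step in range(2000):
    probs = softmax(theta)
    a = rng.choice(3, p=probs)                     # sample an action from the policy
    r = true_rewards[a] + rng.normal(scale=0.1)    # noisy reward
    # For a softmax policy, grad of log pi(a) w.r.t. theta is one_hot(a) - probs.
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta += lr * r * grad_log_pi                  # REINFORCE update (no baseline)

print(softmax(theta))  # probability mass should concentrate on arm 2

The key line is the update: actions that earned higher reward get their log-probability pushed up more, which is the basic idea the last 30+ years of methods refine.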
jbhuang0604.bsky.social
Awesome! 🤩

So glad to hear the authors enjoyed the video, totally made my day!
jbhuang0604.bsky.social
We had a blast at CVPR2025!

There was so much to learn! I was particularly excited to meet many new friends and reconnect with old ones.

I feel energized. Already looking forward to the next one!
jbhuang0604.bsky.social
Kullback–Leibler (KL) divergence is a cornerstone of machine learning.

We use it everywhere, from training classifiers and distilling knowledge from models, to learning generative models and aligning LLMs.

BUT, what does it mean, and how do we (actually) compute it?

Video: youtu.be/tXE23653JrU
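(If you want to play with it before watching, here's a minimal sketch, my own toy numbers, of KL(p || q) for discrete distributions, computed exactly and estimated with Monte Carlo samples drawn from p.)

import numpy as np

p = np.array([0.1, 0.2, 0.7])   # "true" distribution
q = np.array([0.3, 0.3, 0.4])   # approximating distribution

# Exact: KL(p || q) = sum_x p(x) * log(p(x) / q(x))
kl_exact = np.sum(p * np.log(p / q))

# Monte Carlo estimate: draw x ~ p, average log p(x) - log q(x)
rng = np.random.default_rng(0)
samples = rng.choice(len(p), size=100_000, p=p)
kl_mc = np.mean(np.log(p[samples]) - np.log(q[samples]))

print(kl_exact, kl_mc)   # the two numbers should roughly agree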
jbhuang0604.bsky.social
My X/Twitter account has been hacked... Please don't believe what they said!

I'm trying to get it back. In the meantime, sorry for the inconvenience!
jbhuang0604.bsky.social
How LLMs Learn to Reason with Reinforcement Learning

Full video: www.youtube.com/watch?v=mg-i...
jbhuang0604.bsky.social
Ha! Yes, Seungjae insisted that we call this IVE.
jbhuang0604.bsky.social
RL is so back!

Reinforcement learning is a key driver in aligning LLMs and enhancing their reasoning capabilities.

BUT, it’s a tricky topic to wrap your head around (at least for me 😵‍💫).

So, I put up a video breaking down the basics in a way that clicked for me. I hope it helps you, too!
jbhuang0604.bsky.social
I find TRPO's idea of learning from others' experiences fascinating.

So, I started running TRPO for my group, making all (previously individual) feedback on experiments, writing, rebuttals, and presentations public.

Now everyone gets to learn from each other’s trajectories!