Han Bao
@han-b.bsky.social
Associate Professor @ The Institute of Statistical Mathematics, working in ML theory
https://hermite.jp/
Recently I've been getting more and more unsure about how many research projects I'm actively involved in, and after writing them down it turns out to be 10 in total! All of them excite me equally; the only problem is the limited time 🤯
December 22, 2025 at 11:39 PM
Reposted by Han Bao
The list of accepted papers at the Algorithmic Learning Theory Conference, a.k.a. #ALT2026, is out! h/t @thejonullman.bsky.social (PC chair).

algorithmiclearningtheory.org/alt2026/acce...

"ALT: topics so hot, it has to be held in Canada in February"
December 21, 2025 at 10:24 PM
I passed A2🥳
December 16, 2025 at 7:28 AM
Reposted by Han Bao
OpenReview opened the door to continuous, major revisions that nobody has time to check properly.
I think we should go back to short, one-page PDF replies to reviews. It would mean quicker decisions, so that we actually have time to work on papers before resubmitting them.
December 12, 2025 at 6:55 AM
The next #NeurIPS2025 poster this evening is on the interaction between loss functions and gradient descent dynamics: see you soon at Poster No. 3007😎
🧗‍♂️Why does GD converge beyond [step size] < 2/[smoothness]? We investigate loss functions and identify their *separation margin* as an important factor. Surprisingly, the Rényi 2-entropy yields a super-fast rate T = Ω(ε^{-1/3})!
arxiv.org/abs/2502.04889
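For intuition, here's a toy sketch of my own (assuming the logistic loss as the example; this is not the paper's exact setting), showing GD converging with a step size well above 2/smoothness:

```python
import numpy as np

# Toy illustration: GD on the logistic loss f(w) = log(1 + exp(-w)),
# whose smoothness constant is L = 1/4, so the classical stability
# threshold is 2/L = 8. A much larger step size still drives the loss
# to its infimum 0, because the gradient decays along the trajectory.
def f(w):
    return np.log1p(np.exp(-w))

def grad(w):
    return -1.0 / (1.0 + np.exp(w))  # f'(w) = -sigmoid(-w)

w, eta = 0.0, 20.0  # eta = 20 >> 2/L = 8
for t in range(50):
    w -= eta * grad(w)
print(f(w))  # ~0: GD still converges despite the large step size
```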
December 4, 2025 at 5:19 PM
NeurIPS is as exciting as usual: I could keep up an endless research discussion at the bars even after the poster session! Love seeing that quite a few friends are obsessed with math questions that have yet to see the light of day.
December 4, 2025 at 5:12 PM
Reposted by Han Bao
Although I’ll unfortunately miss #NeurIPS2025, our spotlight work on Conv-FY losses will still be there. My excellent collaborator @han-b.bsky.social will be presenting at Exhibit Hall C,D,E, 3001. If you’re on site, feel free to stop by and chat with him!
Convolutional Fenchel-Young loss paper has been updated slightly, with more examples, e.g., classification with rejection (with the great effort of @caoyuzhou.bsky.social). We're gonna present the spotlight poster at #NeurIPS2025, at 4:30pm-7:30pm on Dec 3. See you all soon😎
Folks, here's our latest work on convolutional Fenchel-Young losses! We show convex smooth losses can have a linear surrogate regret bound on discrete losses (e.g. 0-1 loss, Prec@k), and it relies on inf-convolution, which gives the most beautiful proof I've ever seen.
arxiv.org/abs/2505.09432
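For readers unfamiliar with the terms, here's the shape of the two objects involved (notation is mine, not necessarily the paper's):

```latex
% Infimal convolution of f and g:
\[
  (f \,\square\, g)(x) \;=\; \inf_{y}\,\bigl\{\, f(y) + g(x - y) \,\bigr\},
\]
% and a *linear* surrogate regret bound means: for some constant c > 0,
\[
  R_{\mathrm{discrete}}(h) - R_{\mathrm{discrete}}^{*}
  \;\le\; c\,\bigl( R_{\mathrm{surrogate}}(h) - R_{\mathrm{surrogate}}^{*} \bigr),
\]
% i.e., the excess surrogate risk controls the excess discrete risk linearly.
```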
December 1, 2025 at 11:24 PM
A simple yet computationally efficient approach to online RL:
By noting that the Q-function can be written in terms of a policy and a partition function, we can eliminate Z by taking differences of Q. Thus, all we need is to regress on the **difference of Q** (eq. (6)).

arxiv.org/abs/2410.04612
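Here's my reading of the cancellation, assuming an entropy-regularized/softmax parameterization (an assumption on my side, with inverse temperature β):

```latex
% If the Q-function admits the form
\[
  Q(s, a) \;=\; \beta \log \pi(a \mid s) + \beta \log Z(s),
\]
% then the partition function Z(s) cancels in differences:
\[
  Q(s, a) - Q(s, a') \;=\; \beta \log \frac{\pi(a \mid s)}{\pi(a' \mid s)},
\]
% so regressing on differences of Q never requires computing Z.
```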
November 22, 2025 at 11:37 PM
In the past few weeks, I learned singular perturbation theory (mainly based on Jones (2006) dalab.unn.ru/SiteGorbanKa...). In short, it studies two-timescale dynamics, where x and y are the "fast" and "slow" variables. Fenichel's theorem ensures that the "slow manifold" exists for sufficiently small ε.
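For reference, one standard fast-slow form (following Jones-style notation):

```latex
% Fast-slow system with a small parameter 0 < \varepsilon \ll 1:
\[
  \varepsilon\,\dot{x} = f(x, y), \qquad \dot{y} = g(x, y),
\]
% x is the fast variable, y the slow one. Setting \varepsilon = 0 gives the
% critical manifold \{ f(x, y) = 0 \}; under normal hyperbolicity, Fenichel's
% theorem guarantees that a nearby invariant slow manifold persists for
% sufficiently small \varepsilon > 0.
```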
November 21, 2025 at 12:14 AM
I hadn't noticed that this is fully described in Strogatz's textbook (Chap. 9) under the name "Lorenz map".
2) Metastability of the Lorenz attractor
Metastability can even be seen in the classical Lorenz system. Here's ρ=23.2. Yorke & Yorke (1979) fit least squares to the so-called "peak return map" and (semi-)analytically derived the transient time.

link.springer.com/content/pdf/...
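If you want to play with it, here's a minimal toy sketch of mine reproducing the peak-return-map data (σ = 10, β = 8/3 are the assumed classical parameter values):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Integrate the Lorenz system at rho = 23.2 (the value quoted above) and
# collect successive local maxima z_n of z(t); plotting z_{n+1} vs z_n
# gives the "peak return map".
sigma, rho, beta = 10.0, 23.2, 8.0 / 3.0

def lorenz(t, u):
    x, y, z = u
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

sol = solve_ivp(lorenz, (0, 200), [1.0, 1.0, 1.0], max_step=0.01)
z = sol.y[2]
# successive local maxima of z(t)
peaks = z[1:-1][(z[1:-1] > z[:-2]) & (z[1:-1] > z[2:])]
pairs = np.stack([peaks[:-1], peaks[1:]], axis=1)  # (z_n, z_{n+1}) pairs
print(pairs[:5])
```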
November 11, 2025 at 6:55 AM
Last weekend, I went to Takeda Castle, which is popular for its clouds. Even if it rains, you can still see beautiful clouds.
November 11, 2025 at 12:25 AM
Shiga University has the best campus environment among Japanese universities. This is this morning's view while walking to campus from the train station.

(Disclaimer: the campus is behind the castle😂)
October 23, 2025 at 1:53 AM
Yesterday I heard the opinion that "informatics" sounds like a mere suffix (as in bioinformatics) and not like a principled field, which was shocking to me but makes sense to some extent. Why don't we have a nice name like physics or mathematics…🤯
October 19, 2025 at 10:28 AM
French-speaking Bluesky: I have a question about l'informatique. In Japanese, the literal translation of "computer science" is 計算機科学, but it's not natural, and people use 情報科学/工学 (literally, information science/engineering) as department and discipline names. I know French uses l'informatique, which is (1/n)
October 19, 2025 at 10:20 AM
This paper studies why Adam occasionally causes loss spikes, attributing them to the edge-of-stability phenomenon. As seen in the figure, once training hits the EOS (see (b)), a loss spike is triggered. An interesting experimental report!

arxiv.org/abs/2506.04805
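For context, the classical edge-of-stability heuristic for plain GD (the Adam-specific threshold in the paper may differ):

```latex
% On a local quadratic with curvature \lambda, plain GD iterates satisfy
\[
  x_{t+1} - x^{*} \;=\; (1 - \eta\lambda)\,(x_t - x^{*}),
\]
% which oscillates and diverges once \eta\lambda > 2, i.e., when the sharpness
% \lambda_{\max} exceeds 2/\eta. A spike is triggered when training drifts
% across this threshold.
```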
October 10, 2025 at 7:55 AM
A nice ICML 2025 paper that generalizes the notion of smoothness and proves GD convergence under it. The convergence proof is very transparent: descent lemma + telescoping + step-size estimation.
arxiv.org/abs/2412.11773
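For concreteness, the standard L-smooth instantiation of that recipe (the paper adapts each step to its generalized smoothness):

```latex
% Descent lemma for L-smooth f with step size \eta \le 1/L:
\[
  f(x_{t+1}) \;\le\; f(x_t) - \frac{\eta}{2}\,\lVert \nabla f(x_t) \rVert^2 .
\]
% Telescoping over t = 0, \dots, T-1 and bounding the min by the average:
\[
  \min_{t < T}\, \lVert \nabla f(x_t) \rVert^2
  \;\le\; \frac{2\,\bigl(f(x_0) - f^{*}\bigr)}{\eta\,T}.
\]
```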
October 6, 2025 at 10:47 PM
Yesterday I learned from a pharmacologist that the integration of surrogate models and pharmacokinetics (PK) has been emerging. For example, this (jpharmsci.org/article/S002...). I wonder if we could refine it with a PINN-like approach.
Prediction of Human Pharmacokinetics From Chemical Structure: Combining Mechanistic Modeling with Machine Learning
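To make my PINN musing concrete, here's a hypothetical toy sketch for a one-compartment model dC/dt = -k_e·C (all names and values are my own illustration, not from the linked paper):

```python
import torch

# A PINN-style fit: the network C_theta(t) is trained jointly on sparse
# concentration measurements and the ODE residual dC/dt + k_e * C = 0.
k_e = 0.5  # assumed elimination rate constant

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

t_data = torch.tensor([[0.0], [1.0], [2.0]])        # toy sampling times
c_data = torch.exp(-k_e * t_data)                    # toy noiseless data
t_col = torch.linspace(0, 4, 64).reshape(-1, 1).requires_grad_(True)

for _ in range(2000):
    opt.zero_grad()
    c_pred = net(t_col)
    dc_dt, = torch.autograd.grad(c_pred.sum(), t_col, create_graph=True)
    loss_ode = ((dc_dt + k_e * c_pred) ** 2).mean()  # physics residual
    loss_fit = ((net(t_data) - c_data) ** 2).mean()  # data fit
    (loss_ode + loss_fit).backward()
    opt.step()
```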
September 26, 2025 at 7:09 AM
Reposted by Han Bao
Thrilled to share our paper is accepted to #NeurIPS2025 as a Spotlight! 🎉 Big thanks to my awesome collaborators and the program committee. See you in San Diego!
Folks, here's our latest work on convolutional Fenchel-Young losses! We show convex smooth losses can have a linear surrogate regret bound on discrete losses (e.g. 0-1 loss, Prec@k), and it relies on inf-convolution, which gives the most beautiful proof I've ever seen.
arxiv.org/abs/2505.09432
September 22, 2025 at 5:19 AM
Why don't we bring an ACL Findings-style publication model to ML venues? I don't see any compelling reason to be reluctant. I know borderline papers sometimes do not provide "surprising" insights, but that ought to be judged by the test of time.
September 20, 2025 at 6:20 AM
Reposted by Han Bao
This is the metareview
September 18, 2025 at 6:50 PM
Reposted by Han Bao
This is not OK.

I don't submit to NeurIPS often, but I have reviewed papers for this conference almost every year. As a reviewer, why would I spend time trying to give a fair opinion on papers if this is what happens in the end???
This is the metareview
September 20, 2025 at 6:10 AM
I'm so happy that my paper was accepted to JMLR after more than a year! That was a long wait…
September 18, 2025 at 12:16 PM
This weekend I attended a workshop with mathematicians, where I discussed dynamical systems. The workshop gave me numerous insights that were novel to me, which I want to briefly share.

1) Metastability in the Allen-Cahn equation

(fig based on people.maths.ox.ac.uk/trefethen/pd...)
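For reference, the equation in question (standard 1-D form, quoted from memory):

```latex
% Allen-Cahn equation with a small parameter 0 < \varepsilon \ll 1:
\[
  u_t \;=\; \varepsilon^2 u_{xx} + u - u^3 .
\]
% Metastability: multi-front profiles near u = \pm 1 appear stationary for
% exponentially long times of order e^{c/\varepsilon} before fronts collide.
```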
September 15, 2025 at 5:22 AM
Today's favorite: "Clustering with Bregman Divergences: an Asymptotic Analysis" (Liu & Belkin, 2016)
proceedings.neurips.cc/paper_files/...

It concerns the limiting distribution of k-means (or Bregman) centroids (as n, k → ∞). This is an escort distribution! (maybe an overlooked fact)
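My paraphrase of the connection (a Zador-type quantization asymptotic, not the paper's exact statement):

```latex
% For squared-error k-means in d dimensions, the empirical distribution of
% the optimal centroids converges (n, k \to \infty) to the point density
\[
  q(x) \;\propto\; p(x)^{\frac{d}{d+2}},
\]
% which is exactly an escort distribution of p with order d/(d+2).
```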
September 2, 2025 at 10:44 PM