Lightnews — Scholar-powered news

Leena C Vankadara

@leenacvankadara.bsky.social

260 followers 67 following 11 posts

Lecturer @GatsbyUCL; Previously Applied Scientist @AmazonResearch; PhD @MPI-IS @UniTuebingen

Posts Replies Media Videos

Leena C Vankadara

@leenacvankadara.bsky.social

This may explain the practical success of CE over MSE!

CE admits larger LRs → richer feature learning. MSE is restricted to Lazy regime.

Validation: Under µP (where both losses admit feature learning), performance gaps vanish. MSE even seems to have an edge at scale! (7/10)

December 3, 2025 at 5:37 PM

Leena C Vankadara

@leenacvankadara.bsky.social

At the edge of this regime (where η ∝ 1/√m), there exists a well-defined infinite-width limit where feature learning persists in all hidden layers.

This Feature Learning Limit closely matches the behavior of optimally tuned finite-width networks under CE loss. (6/10)

December 3, 2025 at 5:37 PM

Leena C Vankadara

@leenacvankadara.bsky.social

We resolve this via a fine-grained analysis of the regime previously considered unstable (and therefore uninteresting).

Under CE loss, we find this regime comprises two distinct sub-regimes: A Catastrophically Unstable Regime and A benign Controlled Divergence regime. (4/10)

December 3, 2025 at 5:37 PM

Leena C Vankadara

@leenacvankadara.bsky.social

We find this discrepancy persists even accounting for finite-width effects due to Catapult/EOS, Large Depth, Alignment Violations.

In fact, infinite-width alignment predictions hold robustly when measured with sufficient granularity.

So what explains this discrepancy? (3/10)

December 3, 2025 at 5:37 PM

Leena C Vankadara

@leenacvankadara.bsky.social

Most nets use He/Lecun init with single LR η. As width m→∞, theory says

η∈O(1/m)⟹Kernel; η∈ω(1/m)⟹Unstable.

Thus max stable LR∝1/m.

Practice violates this. Optimal LRs are larger (e.g.∝1/√m) & models admit feature learning; contradicts kernel predictions. Why? (2/10)

December 3, 2025 at 5:37 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news