Jannis Born
@jannisblrn.bsky.social
Research Scientist @IBM - AI for Scientific Discovery! Tech & sports enthusiast
In our upcoming #ICML2025 paper, we introduce the #NumberTokenLoss (NTL) to address this (see the demo above). NTL is a regression-style loss computed at the token level; no extra regression head is needed. We propose adding NTL on top of cross-entropy (CE) during LLM pretraining. Our experiments show: (see ⬇️)
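A minimal sketch of the idea, assuming an NTL-MSE-style variant: take the model's probabilities over number tokens, compute the expected numeric value, and penalize its squared distance to the label. Function and variable names here are hypothetical, not the paper's implementation.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ntl_mse(logits, label_value, token_values):
    """Sketch of a token-level regression loss (hypothetical NTL-MSE):
    squared error between the label's numeric value and the expected
    value of the predicted distribution over number tokens.
    `token_values` maps each number-token index to its numeric value
    (here just the digits 0-9); non-number tokens would be masked out."""
    probs = softmax(logits)
    expected = sum(p * v for p, v in zip(probs, token_values))
    return (expected - label_value) ** 2

digits = list(range(10))
# Logits peaked at "6" vs. peaked at "9"; the ground-truth digit is 5.
logits_6 = [0.0] * 10; logits_6[6] = 5.0
logits_9 = [0.0] * 10; logits_9[9] = 5.0
# Unlike CE, this loss is smaller when the predicted number is closer:
print(ntl_mse(logits_6, 5, digits) < ntl_mse(logits_9, 5, digits))  # True
```

Because the loss only needs a mapping from number tokens to their values, it can sit on top of CE for any tokenizer without changing the model head.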
July 3, 2025 at 9:21 PM
#ICML Why are LLMs so powerful but still suck at math? 🤔 A key problem is cross-entropy loss: it is nominal-scale, so tokens are unordered. That makes sense for words, but not for numbers. For a "5" label, predicting "6" or "9" gives the same loss 😱 Yes, it's crazy! No, nobody has fixed this yet! ⬇️
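The claim is easy to verify numerically: CE only looks at the probability assigned to the correct token, so two predictions that put the same mass on "5" get identical loss, no matter whether the rest of the mass sits on the nearby "6" or the distant "9". A small illustrative demo (digit vocabulary and probabilities are made up):

```python
import math

def cross_entropy(probs, label):
    # CE depends only on the probability of the correct token.
    return -math.log(probs[label])

label = 5  # ground-truth digit token "5"

# Two predictions with identical mass on "5":
# one peaked at the neighboring "6", one at the far-away "9".
peak_6 = [0.01] * 10; peak_6[6] = 0.90; peak_6[5] = 0.02
peak_9 = [0.01] * 10; peak_9[9] = 0.90; peak_9[5] = 0.02

# Same loss either way: CE is blind to numeric distance.
print(cross_entropy(peak_6, label) == cross_entropy(peak_9, label))  # True
```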
July 3, 2025 at 9:21 PM
If you're at @neuripsconf.bsky.social and into #OptimalTransport & bio, don't miss Alice Driessen's spotlight talk on the #ConditionalMongeGap for modeling CAR response, today at the #AIDrugX workshop!

Positive results on out-of-distribution (OOD) perturbations: accurate gene expression prediction. Paper: ibm.biz/carot-pre
December 15, 2024 at 9:29 PM
Full poster
December 14, 2024 at 10:48 PM
A new loss improves math capabilities in language models! The loss is model-agnostic and only requires knowing which tokens represent numbers.
No computational overhead, but better performance.
Poster today @NeurIPS - MathAI Workshop! Thx to collaborators from TUM AI!
Paper: arxiv.org/abs/2411.02083
December 14, 2024 at 10:31 PM