Dan Roy
roydanroy.bsky.social
Research Director, Founding Faculty, Canada CIFAR AI Chair @VectorInst.
Full Prof @UofT - Statistics and Computer Sci. (x-appt) danroy.org

I study assumption-free prediction and decision making under uncertainty, with inference emerging from optimality.
Tian and Karolina and team are at ICLR. Come say hi.
📣 The Journey Matters: Our #ICLR2025 paper shows how to pretrain sparse LLMs with half the size of dense LLMs while maintaining quality. We found that the average parameter count during sparse pre-training predicts quality, not final size. An MIT/Rice/Google/ISTA collab 🧵 1/N
April 21, 2025 at 1:00 PM
Curious. Didn’t know Meta had a PPL team.
April 7, 2025 at 1:13 AM
I like to think about non-reasoning model responses as vibes.
April 7, 2025 at 12:50 AM
So who’s read the 2027 article? What do you think?
April 7, 2025 at 12:47 AM
Someone has suggested I check out bsky again. So I'm back looking around here. Notification list is kinda boring. So any good conversations going on? Perhaps about LLM/AI reasoning?
March 23, 2025 at 9:10 PM
Of course.
March 23, 2025 at 9:04 PM
Anyone else have the worry that a lot of LLM research is .... just bad psychology?
February 3, 2025 at 3:26 AM
And, to achieve the results in this paper, what was the most challenging part? Why had previous attempts fallen short? What was your key new insight?
February 3, 2025 at 3:25 AM
Very interesting. So, what was the biggest hole to fill, in terms of hypotheses?
February 2, 2025 at 5:19 PM
Reposted by Dan Roy
Okay, so just a few* thoughts (*this got longer as I wrote 😅….long thread)-
Having a lot of thoughts & feelings today as someone who’s worked both on FB’s misinfo interventions (back when they were making investments) and Birdwatch (pre-Musk Community Notes) 💔. Debating if a thread is worth the inevitable headache 😓
January 8, 2025 at 12:40 PM
Acknowledgments.
January 8, 2025 at 4:56 PM
I got to ski Revelstoke this winter break.

Couple observations: the price of receiving 600 cm of snow by Jan 8 is that it is constantly snowing. Saw almost no sun the whole time and the peak was often in whiteout conditions (though North Bowl was always clear…).

See image for more.
January 8, 2025 at 4:56 PM
Multiple friends have likely lost their homes in Los Angeles. Can’t imagine how disorienting this would be. They had only minutes to flee and grab belongings.
January 8, 2025 at 4:55 PM
What are the key papers to read?
December 30, 2024 at 8:36 PM
OK. Practical question time. How are you adjusting your research given progress in reasoning-style models? Also, how are you adjusting the way you work?
December 22, 2024 at 7:39 AM
A $100,000,000 experiment is no longer "consequence" free. Ilya is saying "scaling is over", but this may simply be that the scaling "laws" (not laws) are no longer accurate. Also, those laws are tied to hyperparameter tunings.
December 15, 2024 at 1:08 PM
Sure some were empirical. Some were not.
December 14, 2024 at 9:27 PM
I'd say no in a sense. Xavier-He initialization was theoretical work. And that was absolutely critical.
December 14, 2024 at 1:47 AM
Pretraining is not done. It's just that theorists haven't told the hackers how to do it better.
ILYA: "PRETRAINING IS DONE. WE ARE NOW IN THE POST TRAINING ERA."
December 13, 2024 at 10:52 PM
Annoying. If it could be automatic, sure.
December 13, 2024 at 12:14 AM
I'd say wait then.
December 12, 2024 at 10:37 PM
That's part of the spec. I don't think this is too problematic. The example they give is problems in NP, where there is a polynomial time checker (i.e., a polytime EV), but generating an instance that passes the checker is hard in the worst case.
December 12, 2024 at 10:37 PM
Now that I've had a taste of X without post length limitations, I've got to say that it is quite annoying to have to fit tweets into 256 characters here on bsky. On X, when they get too long, they go below the fold, and so you're still incentivized to make it short. Can't we have that here?
December 12, 2024 at 10:35 PM
Lottery ticket?
December 11, 2024 at 10:18 PM
@gkdziugaite.bsky.social. Works at GDM and Mila. Influential, technical work.
December 11, 2024 at 1:20 PM