Dan Roy

@roydanroy.bsky.social

8.3K followers 550 following 150 posts

Research Director, Founding Faculty, Canada CIFAR AI Chair @VectorInst.
Full Prof @UofT - Statistics and Computer Sci. (x-appt) danroy.org

I study assumption-free prediction and decision making under uncertainty, with inference emerging from optimality.

Dan Roy

@roydanroy.bsky.social

Tian and Karolina and team are at ICLR. Come say hi.

Tian Jin @tjin.bsky.social · Apr 21

📣 The Journey Matters: Our #ICLR2025 paper shows how to pretrain sparse LLMs with half the size of dense LLMs while maintaining quality. We found that the average parameter count during sparse pre-training predicts quality, not final size. An MIT/Rice/Google/ISTA collab 🧵 1/N

April 21, 2025 at 1:00 PM

Dan Roy

@roydanroy.bsky.social

Curious. Didn’t know Meta had a PPL team.

April 7, 2025 at 1:13 AM

Dan Roy

@roydanroy.bsky.social

I like to think about non-reasoning model responses as vibes.

April 7, 2025 at 12:50 AM

Dan Roy

@roydanroy.bsky.social

So who’s read the 2027 article? What do you think?

April 7, 2025 at 12:47 AM

Dan Roy

@roydanroy.bsky.social

Someone has suggested I check out bsky again. So I'm back looking around here. Notification list is kinda boring. So any good conversations going on? Perhaps about LLM/AI reasoning?

March 23, 2025 at 9:10 PM

Dan Roy

@roydanroy.bsky.social

Of course.

March 23, 2025 at 9:04 PM

Dan Roy

@roydanroy.bsky.social

Anyone else have the worry that a lot of LLM research is .... just bad psychology?

February 3, 2025 at 3:26 AM

Dan Roy

@roydanroy.bsky.social

And, to achieve the results in this paper, what was the most challenging part? Why had previous attempts fallen short? What was your key new insight?

February 3, 2025 at 3:25 AM

Dan Roy

@roydanroy.bsky.social

Very interesting. So, what was the biggest hole to fill, in terms of hypotheses?

February 2, 2025 at 5:19 PM

Reposted by Dan Roy

Mary Beth Hunzaker

@mbhunzaker.bsky.social

Okay, so just a few* thoughts (*this got longer as I wrote 😅….long thread)-

Mary Beth Hunzaker @mbhunzaker.bsky.social · Jan 7

Having a lot of thoughts & feelings today as someone who’s worked both on FBs misinfo interventions (back when they were making investments) and Birdwatch (pre-Musk Community Notes) 💔. Debating if a thread is worth the inevitable headache 😓

January 8, 2025 at 12:40 PM

Dan Roy

@roydanroy.bsky.social

Acknowledgments.

January 8, 2025 at 4:56 PM

Dan Roy

@roydanroy.bsky.social

I got to ski Revelstoke this winter break.

Couple observations: the price of receiving 600 cm of snow by Jan 8 is that it is constantly snowing. Saw almost no sun the whole time and the peak was often in whiteout conditions (though North Bowl was always clear…).

See image for more.

January 8, 2025 at 4:56 PM

Dan Roy

@roydanroy.bsky.social

Multiple friends have likely lost their homes in Los Angeles. Can’t imagine how disorienting this would be. They had only minutes to flee and grab belongings.

January 8, 2025 at 4:55 PM

Dan Roy

@roydanroy.bsky.social

What are the key papers to read?

December 30, 2024 at 8:36 PM

Dan Roy

@roydanroy.bsky.social

OK. Practical question time. How are you adjusting your research given progress in reasoning-style models? Also, how are you adjusting the way you work?

December 22, 2024 at 7:39 AM

Dan Roy

@roydanroy.bsky.social

A $100,000,000 experiment is no longer "consequence" free. Ilya is saying "scaling is over", but this may simply be that the scaling "laws" (not laws) are no longer accurate. Also, those laws are tied to hyperparameter tunings.

December 15, 2024 at 1:08 PM

Dan Roy

@roydanroy.bsky.social

Sure some were empirical. Some were not.

December 14, 2024 at 9:27 PM

Dan Roy

@roydanroy.bsky.social

I'd say no in a sense. Xavier-He initialization was theoretical work. And that was absolutely critical.

December 14, 2024 at 1:47 AM
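
For readers unfamiliar with the initialization schemes mentioned in the post above, here is a minimal sketch of the standard deviations they prescribe, assuming the commonly cited normal-distribution variants: Xavier (Glorot) uses variance 2 / (fan_in + fan_out) and He uses variance 2 / fan_in. The function names and the plain-list weight matrix are illustrative only and are not taken from the post.

```python
import math
import random

def xavier_std(fan_in, fan_out):
    # Xavier/Glorot (normal variant): variance = 2 / (fan_in + fan_out)
    return math.sqrt(2.0 / (fan_in + fan_out))

def he_std(fan_in):
    # He (normal variant, aimed at ReLU layers): variance = 2 / fan_in
    return math.sqrt(2.0 / fan_in)

def init_weights(fan_in, fan_out, std, seed=0):
    # Hypothetical helper: draw a fan_out x fan_in matrix of zero-mean Gaussians.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

weights = init_weights(512, 256, he_std(512))
```

Both rules choose the weight scale from the layer widths so that signal variance stays roughly constant from layer to layer at initialization.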

Dan Roy

@roydanroy.bsky.social

Pretraining is not done. It's just that theorists haven't told the hackers how to do it better.

Nathan Lambert @natolambert.bsky.social · Dec 13

ILYA: "PRETRAINING IS DONE. WE ARE NOW IN THE POST TRAINING ERA."

December 13, 2024 at 10:52 PM

Dan Roy

@roydanroy.bsky.social

Annoying. If it could be automatic, sure.

December 13, 2024 at 12:14 AM

Dan Roy

@roydanroy.bsky.social

I'd say wait then.

December 12, 2024 at 10:37 PM

Dan Roy

@roydanroy.bsky.social

That's part of the spec. I don't think this is too problematic. The example they give is problems in NP, where there is a polynomial time checker (i.e., a polytime EV), but generating an instance that passes the checker is hard in the worst case.

December 12, 2024 at 10:37 PM
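
The gap between checking and generating described in the post above can be illustrated with a small sketch. It uses Boolean satisfiability (SAT) as a stand-in NP problem, which is an assumption made for illustration and not the example from the thread. The names `verify` and `brute_force_search` are hypothetical. Clauses are lists of nonzero integers, where `k` means variable k and `-k` means its negation.

```python
from itertools import product

def verify(clauses, assignment):
    # Polynomial-time checker: every clause must contain a satisfied literal.
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_search(clauses, num_vars):
    # Naive generator: tries up to 2**num_vars assignments, calling the checker on each.
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bit for i, bit in enumerate(bits)}
        if verify(clauses, assignment):
            return assignment
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
witness = brute_force_search(clauses, 3)
```

Checking a proposed assignment takes time linear in the size of the formula, while this naive search is exponential in the number of variables. No polynomial-time algorithm for SAT is known for the worst case.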

Dan Roy

@roydanroy.bsky.social

Now that I've had a taste of X without post length limitations, I've got to say that it is quite annoying to have to fit tweets into 256 characters here on bsky. On X, when they get too long, they go below the fold, and so you're still incentivized to make it short. Can't we have that here?

December 12, 2024 at 10:35 PM

Dan Roy

@roydanroy.bsky.social

Lottery ticket?

December 11, 2024 at 10:18 PM

Dan Roy

@roydanroy.bsky.social

@gkdziugaite.bsky.social. Works at GDM and Mila. Influential, technical work.

December 11, 2024 at 1:20 PM
