Nicolas Beltran-Velez
@velezbeltran.bsky.social
1.8K followers 1K following 84 posts
Machine Learning PhD Student @ Blei Lab & Columbia University. Working on probabilistic ML | uncertainty quantification | LLM interpretability. Excited about everything ML, AI and engineering!
Pinned
velezbeltran.bsky.social
I am very excited to share our new NeurIPS 2024 paper + package, Treeffuser! 🌳 We combine gradient-boosted trees with diffusion models for fast, flexible probabilistic predictions and well-calibrated uncertainty.

paper: arxiv.org/abs/2406.07658
repo: github.com/blei-lab/tre...

🧵(1/8)
Samples y | x from Treeffuser vs. true densities, for multiple values of x under three different scenarios. Treeffuser captures arbitrarily complex conditional distributions that vary with x.
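A minimal usage sketch of the package, based on its scikit-learn-style interface; the method names and signatures here are assumptions, so check the repo for the actual API:

```python
# Hypothetical usage sketch of Treeffuser (signatures are assumptions;
# see github.com/blei-lab/treeffuser for the real API).
import numpy as np
from treeffuser import Treeffuser  # assumed import path

# Toy heteroscedastic data: the noise level grows with x.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0 * np.pi, size=(1000, 1))
y = np.sin(X[:, 0]) + rng.normal(0.0, 0.05 + 0.2 * X[:, 0] / (2.0 * np.pi))

model = Treeffuser()  # gradient-boosted trees act as the diffusion score model
model.fit(X, y)

# Draw samples from the learned conditional p(y | x); their spread is the
# predictive uncertainty, and means/quantiles come straight from the draws.
X_new = np.array([[1.0], [5.0]])
samples = model.sample(X_new, n_samples=100)  # assumed signature
print(samples.mean(axis=0), samples.std(axis=0))
```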
Reposted by Nicolas Beltran-Velez
cancerdynamics.bsky.social
🎓 Hats off to the 2025 IICD graduates: Yining Ma, Junze Huang, Yichi Yang, Ruilin Dai, Boan Zhu, Cameron Park, @jlfan.bsky.social & Achille Nazaret!
Wishing you all the best in your next chapter — we’re proud of you! 💙 #Columbia2025
@bleilab.bsky.social @khanhndinh.bsky.social @elhamazizi.bsky.social
Reposted by Nicolas Beltran-Velez
fredashi.bsky.social
I received a review like this five years ago. It’s probably the right time now to share it with everyone who wrote or got random discouraging reviews from ICML/ACL.
Reposted by Nicolas Beltran-Velez
natolambert.bsky.social
First 11 chapters of RLHF Book have v0 draft done. Should be useful now.

Next:
* Crafting more blog content into future topics,
* DPO+ chapter,
* Meeting with publishers to get wheels turning on physical copies,
* Cleaning & cohesiveness
rlhfbook.com
Reposted by Nicolas Beltran-Velez
briantrippe.bsky.social
🔥 Benchmark Alert! MotifBench sets a new standard for evaluating protein design methods in motif scaffolding.
Why does this matter? Reproducibility & fair comparison have been lacking—until now.
Paper: arxiv.org/abs/2502.12479 | Repo: github.com/blt2114/Moti...
A thread ⬇️
Reposted by Nicolas Beltran-Velez
dorialexander.bsky.social
The HuggingFace/Nanotron team just shipped an entire pretraining textbook in interactive format. huggingface.co/spaces/nanot...

It’s not just great pedagogical support; it also presents a wealth of new data and experiments in a systematic way for the first time.
velezbeltran.bsky.social
I just wanted to see what it looked like 😭
velezbeltran.bsky.social
Good God, please. I just want some gradients that don't vanish 😭
Reposted by Nicolas Beltran-Velez
juand-r.bsky.social
I was hoping that recent events would lead to a mass exodus from X. Many have left, but most of the ML and LLM people have not.

I have lost a lot of respect for the ML community.
Reposted by Nicolas Beltran-Velez
lebellig.bsky.social
Now that Bluesky has GIFs (it didn't work?), I can share (again) my educational notebook on discrete flow matching (by Itai Gat et al.). Also, please check out the original article and the official implementation by Meta!

🐍 github.com/gle-bellier/...
🐍 github.com/facebookrese...
📄 arxiv.org/abs/2407.15595
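For intuition, here is a toy sketch of the masking-style probability path and loss that this family of discrete flow matching methods uses; the names and details are illustrative, not taken from the notebook or paper:

```python
# Toy sketch of a masking probability path for discrete flow matching
# (illustrative only; see the linked notebook and paper for the real thing).
import torch

MASK = 0  # reserved mask token id (assumption)

def corrupt(x1: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Sample x_t from the path p_t(x | x1): each token of the data x1 is
    kept with probability t and replaced by MASK otherwise."""
    keep = torch.rand_like(x1, dtype=torch.float) < t.unsqueeze(-1)
    return torch.where(keep, x1, torch.full_like(x1, MASK))

def dfm_loss(model, x1: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on masked positions: the model learns the posterior
    p(x1 | x_t), which induces the generating velocity used at sampling."""
    t = torch.rand(x1.shape[0])            # random time in [0, 1] per example
    xt = corrupt(x1, t)
    logits = model(xt)                     # (batch, seq, vocab)
    ce = torch.nn.functional.cross_entropy(
        logits.transpose(1, 2), x1, reduction="none")
    return ce[xt == MASK].mean()
```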
Reposted by Nicolas Beltran-Velez
canaesseth.bsky.social
Really excited about this! We note a connection between diffusion/flow models and neural/latent SDEs. We show how to use this for simulation-free learning of fully flexible SDEs. We refer to this as SDE Matching and show speed improvements of several orders of magnitude.

arxiv.org/abs/2502.02472
SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations
The Latent Stochastic Differential Equation (SDE) is a powerful tool for time series and sequence modeling. However, training Latent SDEs typically relies on adjoint sensitivity methods, which depend ...
arxiv.org
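Roughly, the latent SDE being trained looks like this (notation mine, not from the paper):

```latex
% Latent SDE: the latent state z_t follows a learned prior SDE, and
% observations x_{t_i} are decoded from the latent at observation times.
\begin{aligned}
  \mathrm{d}z_t &= f_\theta(z_t, t)\,\mathrm{d}t + g_\theta(z_t, t)\,\mathrm{d}W_t,\\
  x_{t_i} &\sim p_\theta\!\left(x \mid z_{t_i}\right).
\end{aligned}
```

Classical training simulates trajectories of this SDE and backpropagates through them with adjoint sensitivities; SDE Matching instead uses a direct, simulation-free objective in the spirit of diffusion-model losses, which is where the speedups come from.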
Reposted by Nicolas Beltran-Velez
tedunderwood.com
I have a sinking feeling that by 2029 I'm going to be faking a British accent so no one will think I was one of the *Americans* working on AI during the regime.
[Image: scatterplot of "Interest in AI" (x-axis, roughly -2 to 2) vs. "Willingness to Tolerate Closed, Autocratic Systems" (y-axis, roughly -2 to 2), split into quadrants by blue lines at zero. Black dots spread across all four quadrants; a few red dots labeled "my peeps" cluster in the bottom-right quadrant: high interest in AI, low tolerance for closed, autocratic systems.]
velezbeltran.bsky.social
NGL, it's kind of surprising that more people haven't migrated here, especially given what Musk has been doing these days. I don't get it.
Reposted by Nicolas Beltran-Velez
natolambert.bsky.social
Since everyone wants to learn RL for language models post-DeepSeek, a reminder that I've been working on this book quietly in the background for months.

Policy gradient chapter is coming together. Plugging away at the book every day now.

rlhfbook.com/c/11-policy-...
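For context, the simplest estimator that policy-gradient chapters build from is REINFORCE; here is a toy sketch (mine, not the book's):

```python
# Toy REINFORCE step for a language-model policy (illustrative only;
# see rlhfbook.com for the actual algorithms the chapter covers).
import torch

def reinforce_loss(logprobs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """logprobs: (batch, seq) log pi(token | context) for sampled completions;
    rewards: (batch,) scalar reward per completion.
    The gradient of this loss is the policy-gradient estimator
    -E[(R - baseline) * grad log pi(completion)], with a mean baseline
    to reduce variance."""
    seq_logprob = logprobs.sum(dim=-1)      # log-prob of the whole completion
    advantage = rewards - rewards.mean()    # simple baseline
    return -(advantage.detach() * seq_logprob).mean()
```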
Reposted by Nicolas Beltran-Velez
eugenevinitsky.bsky.social
Please stop anthropomorphizing language models, it makes them feel really bad
Reposted by Nicolas Beltran-Velez
jeffdean.bsky.social
Nazi salutes and speaking at neo-Nazi rallies seem bad. There's history that we should learn from.
velezbeltran.bsky.social
Something I really like about NLP research is that it makes everything super intuitive. This week I have been thinking about variational inference in NLP, and a lot of the things that seemed to require mathematical intuition just become trivial when you think about them in terms of language. So cool :)
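For the curious, the central object here is the evidence lower bound (notation mine, not from the post); in language terms, the first term asks "how well can I regenerate the text from the latent?" and the second "how far does the latent stray from the prior?":

```latex
% Evidence lower bound (ELBO): q_\phi approximates the posterior over a
% latent z given observed text x.
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - \mathrm{KL}\!\left( q_\phi(z \mid x) \,\middle\|\, p(z) \right)
```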
velezbeltran.bsky.social
But the memory needed for the value function kills those of us who don't have good GPUs 😭
Reposted by Nicolas Beltran-Velez
emollick.bsky.social
New randomized, controlled trial by the World Bank of students using GPT-4 as a tutor in Nigeria. Six weeks of after-school AI tutoring = 2 years of typical learning gains, outperforming 80% of other educational interventions.

And it helped all students, especially girls who were initially behind.
velezbeltran.bsky.social
I mostly use Copilot for writing code (as autocomplete), GPT-4o for boilerplate, and o1 for serious debugging or boilerplate with some complexity or a lot of requirements. I also use o1 for quick but slightly involved experiments, though not as often.
velezbeltran.bsky.social
I use ChatGPT over Google for a lot of things because it is really good at fuzzy queries and aggregating data from many sources. I feel that as long as you double-check the results, it is much faster and more convenient.