Rahul G. Krishnan
@rahulgk.bsky.social
160 followers 71 following 23 posts
Assistant Professor at the University of Toronto ⚒️ 🏥 Deep learning and causal inference for computational medicine
Reposted by Rahul G. Krishnan
vahidbalazadeh.bsky.social
🚨 Introducing CausalPFN, a foundation model trained on simulated data for in-context causal effect estimation, based on prior-fitted networks (PFNs). Joint work with Hamid Kamkari, Layer6AI & @rahulgk.bsky.social 🧵[1/7]

📝 arxiv.org/abs/2506.07918
🔗 github.com/vdblm/Causal...
🗣️Oral@ICML SIM workshop
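For intuition on what "in-context" estimation means here: the pretrained model conditions on an observed dataset and emits effect estimates in one forward pass, with no per-dataset fitting. Below is a minimal sketch on toy simulated data; the `CausalPFN` calls are hypothetical placeholders, not the repo's actual API.

```python
# Toy simulated dataset; the `CausalPFN` interface below is hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 5))                  # covariates
T = rng.binomial(1, 0.5, size=n)             # randomized binary treatment
Y = X[:, 0] + 2.0 * T + rng.normal(size=n)   # outcome; true ATE = 2.0

# Difference-in-means baseline (valid here only because T is randomized):
ate_dim = Y[T == 1].mean() - Y[T == 0].mean()
print(f"difference-in-means ATE: {ate_dim:.2f}")  # close to 2.0

# A PFN is pretrained once on many simulated causal worlds; at inference it
# conditions on the observed (X, T, Y) and predicts effects in context
# (hypothetical calls, not the actual package API):
# model = CausalPFN.load_pretrained(...)
# cate_hat = model.estimate_cate(X, T, Y)   # per-unit effect estimates
# ate_hat = cate_hat.mean()
```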
rahulgk.bsky.social
There's lots more to do to understand CFT better, and to build on it to create better post-training methods for fine-tuning large language models.

Reach out to me or Ethan if you're interested in collaborating on this or pushing this idea to new domains and problems!
rahulgk.bsky.social
📖 We’ve also open-sourced OpenMedText, integrating 121K biomedical articles & 29 medical textbooks to support future research on domain-adaptive fine-tuning in biomedicine.
rahulgk.bsky.social
🔧 We also tested "negative" and "adaptive" prompts, confirming that the semantic content of the prompt matters and impacts fine-tuning effectiveness.
rahulgk.bsky.social
📊 Results: On medical benchmarks, CFT improves accuracy by ~2.25% over CPT; in finance, it boosts performance by ~4.32%! Importantly, these gains scale effectively with larger models. 📈

Check out Appendix E.1 for preliminary results on Gemini 1.5 Flash!
rahulgk.bsky.social
🏥 We tested this idea on biomedical data (using the newly curated OpenMedText dataset of journals & textbooks!) and financial data: CFT significantly outperforms continued pretraining (CPT) and instruction fine-tuning (IFT) in zero-shot settings.
rahulgk.bsky.social
🎓 Instead of using Q&A as in instruction tuning, CFT uses reflective instructions (e.g., "Reflect on how what you will see changes what you know...") motivated by how humans learn.
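For a rough sense of the mechanics, here is a minimal sketch of one plausible CFT training step: the reflective prompt is prepended to each document and masked out of the loss, so gradients come only from the new-domain text. The prompt wording and masking scheme are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal CFT-style step (illustrative; prompt and masking are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CONTEXTUAL_PROMPT = "Reflect on how what you will see changes what you know: "

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def cft_loss(document: str) -> torch.Tensor:
    prompt_ids = tokenizer(CONTEXTUAL_PROMPT, return_tensors="pt").input_ids
    doc_ids = tokenizer(document, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, doc_ids], dim=1)
    # Mask the contextual prompt so the loss covers only the document tokens.
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100
    return model(input_ids=input_ids, labels=labels).loss

loss = cft_loss("Metformin is a first-line therapy for type 2 diabetes.")
loss.backward()  # an optimizer step would follow in a full training loop
```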
rahulgk.bsky.social
💡 Contextual finetuning (CFT) uses contextual prompts during fine-tuning to adaptively shape the semantic understanding LLMs draw on while learning new information.
rahulgk.bsky.social
🚀 Problem: Language models struggle with rapidly evolving info and context in fields like medicine & finance. We need ways to teach LLMs new information and control how they absorb this knowledge.

🔍 Insight: Why not explain and teach LLMs how to learn?
rahulgk.bsky.social
My student, Ethan Choi, will be at #ICLR2025 presenting Contextual Finetuning (CFT) and teaching LLMs how to learn (joint work with Muhammad Adil Asif, Ziwen Han, John Willes @vectorinstitute.ai)

🌟Project page: younwoochoi.github.io/cft-iclr/
#239, April 26, 10:00-12:30 (Hall 3, 2B)
rahulgk.bsky.social
If it helps, I usually learn something new (either directly or from further digging) about the behavior of markets.
Reposted by Rahul G. Krishnan
uoft-tcairem.bsky.social
📣T-CAIREM member @rahulgk.bsky.social's presentation is online! From Associational to Causal Predictions with #DeepLearning: An examination of recent advances in bridging the gap between associative #neuralnetworks and causal reasoning.
🎥 www.youtube.com/watch?v=yE6S...
rahulgk.bsky.social
Rocking that @ Gmail address!
rahulgk.bsky.social
Come by tomorrow to hear about what we have been up to!
torontosri.bsky.social
🚨 Tomorrow! 🚨

SRI Seminar Series welcomes @rahulgk.bsky.social, assistant professor at @uoftcompsci.bsky.social. He’s a Canada Research Chair in Computational Medicine and an AI expert at @vectorinstitute.ai.

🗓️ January 29, 2025 | 12:30 - 2:00 PM ET

🔗 www.eventbrite.ca/e/sri-semina...
rahulgk.bsky.social
I thought about this a bit; I think helping PhD students close the translational gap from research to deployment (in industry or at their own startups), particularly if they don't want to go into academia, is one way forward.
rahulgk.bsky.social
o3 is incredible!

Since we've maxed out scale and $$$ on inference-time compute, I hope we now get back to thinking about the right combination of neural nets and algorithms to build performant models cheaper, faster, and more reliably.
Reposted by Rahul G. Krishnan
projectavi.bsky.social
1/6
Presenting "Unlearning Tabular Data without a 'Forget Set'"! We introduce RELOAD, a new unlearning algorithm for tabular learning. Drop by the @neuripsconf.bsky.social Workshop on Table Representation Learning (@trl-research.bsky.social):
- SAT 14 Dec from 2:30pm-3:15pm!
- East Meeting Room 11-12
rahulgk.bsky.social
Are you around at Neurips? Would love to say hi and catch up!
rahulgk.bsky.social
Come by our poster today to learn about decision making under unobserved confounding!
vahidbalazadeh.bsky.social
How can we use offline expert data with unobserved confounding to guide exploration in RL? Our approach: learn prior distributions from the expert data, then act via posterior sampling.

Come to our poster #NeurIPS2024 today to learn more!

🗓️ Thu 12 Dec 4:30 - 7 pm PST
📍 West Ballroom A-D #6708

(1/5)
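As a toy illustration of the posterior-sampling idea (not the paper's algorithm): a Bernoulli bandit where the Beta prior over each arm is seeded with pseudo-counts derived from how often experts picked it. The expert counts and the prior mapping are invented for this sketch.

```python
# Thompson sampling with an expert-informed prior (all numbers illustrative).
import numpy as np

rng = np.random.default_rng(0)

expert_counts = np.array([30, 5, 15])        # offline expert arm choices
prefs = expert_counts / expert_counts.sum()
alpha = 1.0 + 10.0 * prefs                   # Beta prior pseudo-successes
beta = 1.0 + 10.0 * (1.0 - prefs)            # Beta prior pseudo-failures

true_rates = np.array([0.7, 0.2, 0.5])       # unknown to the learner
succ = np.zeros(3)
fail = np.zeros(3)

for t in range(1000):
    # Posterior sampling: draw one plausible rate per arm, act greedily.
    samples = rng.beta(alpha + succ, beta + fail)
    arm = int(np.argmax(samples))
    reward = float(rng.random() < true_rates[arm])
    succ[arm] += reward
    fail[arm] += 1.0 - reward

print("posterior means:", (alpha + succ) / (alpha + succ + beta + fail))
```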
rahulgk.bsky.social
Finally, if you're interested in understanding how to leverage energy-based normalizing flows, check out Lance's work on Meow (chienfeng-hub.github.io/meow/)

He'll be presenting on Dec. 12, 11:00 AM–2:00 PM at West Ballroom A-D #6403

🧵(7/7)
rahulgk.bsky.social
@nikitadhawan.bsky.social developed NATURAL (www.cs.toronto.edu/~nikita/natu...) with @cottascience.bsky.social, Karen & @cmaddis.bsky.social. It's an end-to-end pipeline that starts from raw-text data and ends with a causal (**) effect associated with an intervention.

(**) conditions apply
🧵(6/7)
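To make the pipeline shape concrete, a hedged sketch: a placeholder LLM extraction step feeding a textbook inverse-propensity estimator. `llm_extract`, the record schema, and the single binary confounder are invented for illustration and are not NATURAL's actual components.

```python
# Sketch of a raw-text -> causal-effect pipeline (all parts illustrative).
from dataclasses import dataclass
import numpy as np

@dataclass
class Record:
    older: bool    # crude binary confounder extracted from the text
    treated: int   # 1 if the author reports taking the intervention
    improved: int  # 1 if the author reports a good outcome

def llm_extract(post: str) -> Record | None:
    """Placeholder: prompt an LLM to parse one post; None if unusable."""
    raise NotImplementedError

def ipw_ate(records: list[Record]) -> float:
    """Inverse-propensity-weighted ATE, adjusting for the one confounder."""
    t = np.array([r.treated for r in records], dtype=float)
    y = np.array([r.improved for r in records], dtype=float)
    g = np.array([r.older for r in records], dtype=int)
    # Propensity of treatment within each confounder stratum.
    e = np.array([t[g == k].mean() for k in (0, 1)])[g]
    return float(np.mean(t * y / e - (1 - t) * y / (1 - e)))
```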
NATURAL
www.cs.toronto.edu
rahulgk.bsky.social
b] Billions of dollars are spent each year on trials to assess interventions.

Can we use crowdsourced data to know which intervention is likely to work ahead of time?

Doing so requires answering a causal question!

But the data to answer this question is locked in unstructured text.

🧵(5/7)
rahulgk.bsky.social
Find Vahid to learn more about in-context causal inference and lots of other cool problems that he spends his time thinking about!

🧵(4/7)
rahulgk.bsky.social
In arxiv.org/abs/2404.07266, Vahid shows how to use offline expert data with unobserved confounding to guide decision making, via a nonparametric prior that steers policy learning for bandits, MDPs, and POMDPs.

🗓️ Thu 12 Dec 4:30 - 7:30 pm PST 📍 West Ballroom A-D, Poster #6708

🧵(3/7)
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
rahulgk.bsky.social
a] Today, we learn from data and treat it as ground truth -- should we?

A doctor often knows more about their patient than is represented in electronic medical records.

A teacher knows more about their students than what their grades suggest.

🧵(2/7)