Andreas Madsen
@andreasmadsen.bsky.social
320 followers · 170 following · 10 posts
Ph.D. in NLP Interpretability from Mila. Previously: independent researcher, freelancer in ML, and Node.js core developer.
andreasmadsen.bsky.social
Also thanks to @sarath-chandar.bsky.social and @sivareddyg.bsky.social for supporting me during my Ph.D., which helped me get this far! I would highly recommend them if you are looking for a Ph.D. supervisor.
andreasmadsen.bsky.social
Positions:
* Full-stack
* Research Engineer
* Research Scientist
* Systems Infrastructure Engineer
* Research intern
Feel free to reach out, but chances are I will see your application if you apply online. I will post details on the internship later, but there are more openings.
andreasmadsen.bsky.social
Excited to finally announce that I have joined @guidelabs.bsky.social. We are building LLMs from scratch designed to be interpretable. Many have asked what I'm doing after my Ph.D., so great to finally get it out. We have a lot of open positions, from engineering to scientist to intern.
andreasmadsen.bsky.social
All investigations of faithfulness show that explanations' faithfulness is, by default, model- and task-dependent. However, this is not the case with FMMs, which thus present a new paradigm for how to provide and ensure faithful explanations.
andreasmadsen.bsky.social
FMMs are models designed such that measuring the faithfulness of an explanation is cheap and precise, which makes it possible to optimize explanations toward maximum faithfulness.
Diagram of faithfulness measurable models: the model is designed so the faithfulness of an explanation can be measured, and that measurement can be used to optimize the explanation.
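As a rough illustration, here is a minimal PyTorch sketch of that measure-then-optimize loop. Everything here is a hypothetical stand-in, not the actual code from the papers: `model` is assumed to be a classifier trained with token masking in-distribution (the FMM property), and the candidate importance vectors are assumed given by some explanation method.

```python
# Minimal sketch of the FMM idea (hypothetical names, not the papers' code).
# Assumes `model(input_ids)` returns logits of shape [1, num_classes] and
# that masking tokens is in-distribution, so the confidence drop below is
# a cheap and precise faithfulness signal rather than an OOD artifact.
import torch

def faithfulness(model, input_ids, mask_token_id, importance, k):
    """Mask the k tokens the explanation ranks as most important and
    measure how much the model's confidence in its prediction drops."""
    with torch.no_grad():
        probs = model(input_ids).softmax(-1)      # [1, num_classes]
        label = probs.argmax(-1)                  # original prediction
        masked = input_ids.clone()
        masked[0, importance.topk(k).indices] = mask_token_id
        probs_masked = model(masked).softmax(-1)
    # A larger drop means the masked tokens mattered more to the model,
    # i.e. the explanation is more faithful.
    return (probs[0, label] - probs_masked[0, label]).item()

def optimize_explanation(model, input_ids, mask_token_id, candidates, k=3):
    """Pick the candidate importance vector with the highest faithfulness."""
    return max(candidates, key=lambda imp: faithfulness(
        model, input_ids, mask_token_id, imp, k))
```

The design point is that the metric itself is built into the model's training, so explanations can be searched or optimized directly against it.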
andreasmadsen.bsky.social
Self-explanations are when LLMs explain themselves. Current models are not capable of this, but we suggest how that could be changed.
Diagram of self-explanations: input goes in, then the regular output and an explanation come out.
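One way to test whether such a self-explanation is faithful is a redaction-based self-consistency check, sketched below. The `ask` function is a hypothetical stand-in for a chat-style LLM call; this is an illustration of the idea, not the exact protocol from the work.

```python
# Sketch of a self-consistency check for self-explanations (hypothetical API).
from typing import Callable

def self_explanation_check(ask: Callable[[str], str], text: str) -> bool:
    """Ask for an answer plus the words it was based on, then redact those
    words and re-ask; a faithful self-explanation should flip the answer."""
    answer = ask(f"What is the sentiment of: {text!r}? Answer in one word.")
    words = ask(f"Which words in {text!r} most influenced that answer? "
                "Reply comma-separated.").split(",")
    redacted = text
    for word in (w.strip() for w in words):
        redacted = redacted.replace(word, "[REDACTED]")
    # If removing the cited words leaves the answer unchanged, the cited
    # words were not actually what the model relied on.
    return ask(f"What is the sentiment of: {redacted!r}? "
               "Answer in one word.") != answer
```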
andreasmadsen.bsky.social
We ask the question: how can we provide and ensure faithful explanations for general-purpose NLP models? The main thesis is that we should develop new paradigms in interpretability. The two new paradigms explored are faithfulness measurable models (FMMs) and self-explanations.
andreasmadsen.bsky.social
I’m thrilled to share that I’ve finished my Ph.D. at Mila and Polytechnique Montreal. For the last 4.5 years, I have worked on creating new faithfulness-centric paradigms for NLP Interpretability. Read my vision for the future of interpretability in our new position paper: arxiv.org/abs/2405.05386
Interpretability Needs a New Paradigm
Interpretability is the study of explaining models in understandable terms to humans. At present, interpretability is divided into two paradigms: the intrinsic paradigm, which believes that only model...
arxiv.org
andreasmadsen.bsky.social
Hi, can you add me? Thanks 🙂