Martina Vilas
martinagvilas.bsky.social
Computer Science PhD student | AI interpretability | Vision + Language | Cognitive Science. Prev. intern @MicrosoftResearch.

https://martinagvilas.github.io/
Pinned
Hi BlueSky! 🦋 I’m a computer science PhD student with a background in cognitive neuroscience. Working at the intersection of these topics, my research focuses on reverse engineering the cognitive capacities of AI models 🧠💻

Some recent examples 👇
Reposted by Martina Vilas
When to call it quits in LLM reasoning? 🛑

Martina's internship project proposes trace-monitoring metrics and classifiers that can detect, midway through an LLM reasoning trace, that it is going to fail. The approach saves up to 70% of token usage and even increases accuracy by 2%-3%.
October 22, 2025 at 10:39 PM
Can we predict which reasoning paths will succeed before seeing the answer? 🤔

Our new paper (arxiv.org/abs/2510.10494) proposes latent-trajectory signals from LLMs' hidden states to identify high-quality reasoning, cutting inference costs by up to 70% while maintaining accuracy
Tracing the Traces: Latent Temporal Signals for Efficient and Accurate Reasoning
Reasoning models improve their problem-solving ability through inference-time scaling, allocating more compute via longer token budgets. Identifying which reasoning traces are likely to succeed remain...
arxiv.org
October 22, 2025 at 3:38 PM
Reposted by Martina Vilas
All Eureka inference-time scaling insights are now available here: www.microsoft.com/en-us/resear... It was fun sharing these and more together with Vidhisha Balachandran @vidhishab.bsky.social and Vibhav Vineet at #ICLR2025.
Eureka Inference-Time Scaling Insights: Where We Stand and What Lies Ahead - Microsoft Research
Understanding and measuring the potential of inference-time scaling for reasoning. The new Eureka study tests nine state-of-the-art models on eight diverse reasoning tasks.
www.microsoft.com
April 29, 2025 at 3:36 PM
Looking forward to presenting this work next week at #ICLR2025! DM me if you are attending and want to grab a coffee to discuss these topics 💫
I will be presenting this ✨ spotlight 💫 paper at #ICLR2025 with @martinagvilas.bsky.social. Come say hi if you're interested in DNN circuits, complexity and #interpretability

📆 Poster Session 4 (#530)
🕰️ Fri 25 Apr. 3:00-5:30 PM
📝 openreview.net/forum?id=Qog...
📊 iclr.cc/virtual/2025...
April 18, 2025 at 6:55 PM
On December 5th, our ML theory group at Cohere For AI is hosting @mathildepapillon.bsky.social to discuss their recent review (arxiv.org/abs/2407.09468) on geometric/topological/algebraic ML.

Join us online 💫
December 2, 2024 at 1:14 PM
Reposted by Martina Vilas
I’m putting together a starter pack for researchers working on human-centered AI evaluation. Reply or DM me if you’d like to be added, or if you have suggestions! Thank you!

(It looks NLP-centric at the moment, but that’s due to the current limits of my own knowledge 🙈)

go.bsky.app/G3w9LpE
November 21, 2024 at 3:56 PM
Reposted by Martina Vilas
I tried to find everyone who works in the area but I certainly missed some folks so please lmk...
go.bsky.app/BYkRryU
November 23, 2024 at 5:11 AM
Reposted by Martina Vilas
Does anyone know of any feeds (or similar) for student internship opportunities in ML/CV/NLP?
November 22, 2024 at 7:19 AM
Reposted by Martina Vilas
I've found starter packs on NLP, vision, graphics, etc. But personally, I would love to know and hear from researchers working on vision-language. So, let me know if you'd like to join this starter pack, would be happy to add!

go.bsky.app/TENRRBb
November 19, 2024 at 9:56 PM
Reposted by Martina Vilas
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this:

Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢

🧵⬇️
November 20, 2024 at 4:35 PM
Reposted by Martina Vilas
LLMs tend to match problem-solving strategies based on textual similarity rather than truly understanding the underlying principles of mathematical problems.

Paper: Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology
November 18, 2024 at 9:29 PM
Reposted by Martina Vilas
A starter pack of people working on interpretability / explainability of all kinds, using theoretical and/or empirical approaches.

Reply or DM if you want to be added, and help me reach others!

go.bsky.app/DZv6TSS
November 14, 2024 at 5:00 PM
Reposted by Martina Vilas
If you’re interested in mechanistic interpretability, I just found this starter pack and wanted to boost it (thanks for creating it @butanium.bsky.social !). Excited to have a mech interp community on bluesky 🎉

go.bsky.app/LisK3CP
November 19, 2024 at 12:28 AM
Reposted by Martina Vilas
I forgot whom in my feed I got this from, but anyway, this network analyzer is crazy efficient. It gives you ideas for accounts to follow based on your own followees. I just added 50 accounts or so.

bsky-follow-finder.theo.io
Bluesky Network Analyzer
Find accounts that you don't follow (yet) but are followed by lots of accounts that you do follow.
November 18, 2024 at 9:32 PM
Reposted by Martina Vilas
there are many smart speakers and thinkers around AI/ML and/or NLP. but i find almost everything to be kinda predictable by now, minor stylistic variations on the same story. who are some *interesting* speakers i should listen/read? i want things that may surprise or inspire me.
November 16, 2024 at 8:41 PM
Reposted by Martina Vilas
Any Latin Americans here working in Cognitive Science, very broadly construed? (Neuroscience, Psychology, Artificial Intelligence, Anthropology, Linguistics, Economics, Ethics, Philosophy, and more…)

I thought I’d create a starter pack but I could only find a handful of us. Say hi?
November 17, 2024 at 1:37 PM
Reposted by Martina Vilas
It is intuitive to observe some complex-looking model behavior (e.g., the classification of images of different animals using an abstract category) and infer an interesting capacity of the model (e.g., the ability to build rich representations that abstract away from particular animals).
November 17, 2024 at 2:34 PM
Hi BlueSky! 🦋 I’m a computer science PhD student with a background in cognitive neuroscience. Working at the intersection of these topics, my research focuses on reverse engineering the cognitive capacities of AI models 🧠💻

Some recent examples 👇
November 17, 2024 at 2:06 PM
Reposted by Martina Vilas
I made a starter pack with the people doing something related to Neurosymbolic AI that I could find.

Let me know if I missed you!
go.bsky.app/RMJ8q3i
November 11, 2024 at 3:27 PM
Reposted by Martina Vilas
New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...
November 9, 2024 at 9:13 AM