Aaron Roth
@aaroth.bsky.social
4.1K followers · 380 following · 340 posts
Professor at Penn, Amazon Scholar at AWS. Interested in machine learning, uncertainty quantification, game theory, privacy, fairness, and most of the intersections therein
Pinned
aaroth.bsky.social
Aligning an AI with human preferences might be hard. But there is more than one AI out there, and users can choose which to use. Can we get the benefits of a fully aligned AI without solving the alignment problem? In a new paper we study a setting in which the answer is yes.
aaroth.bsky.social
If you happen to be at Yale on Thursday at noon, come by my talk and hear about the work we have been doing over the last year studying human/AI collaboration from the perspective of agreement and alignment.
aaroth.bsky.social
Interdisciplinary conferences are great, but FORC is proudly "disciplinary" in that it is focused on work whose primary methodological tool is mathematics and well-defined, formal models. It's a great venue to share work with a community of people working with the same toolkit.
aaroth.bsky.social
The FORC 2026 call for papers is out! responsiblecomputing.org/forc-2026-ca... Two reviewing cycles with two deadlines: Nov 11 and Feb 17. If you haven't been, FORC is a great venue for theoretical work in "responsible AI" --- fairness, privacy, social choice, CS&Law, explainability, etc.
FORC 2026: Call for Papers
The 7th annual Symposium on Foundations of Responsible Computing (FORC) will be held on June 3-5, 2026 at Harvard University. Brief summary for those who are familiar with past editions (prior to 2…
responsiblecomputing.org
aaroth.bsky.social
Well, ChatGPT is only 3.
aaroth.bsky.social
There will be answers, but it might require new pedagogical habits.
aaroth.bsky.social
Educating the next generation of researchers in the presence of these tools will be interesting. How do we teach students to become sufficiently expert that they can use the tools productively? How does one learn expertise?
aaroth.bsky.social
One more thought: AI tools are a very useful research accelerator for an expert, and I plan to use them whenever I can. But at the moment it is very easy to be led down false paths if you let them get ahead of you and lure you too far from your expertise.
aaroth.bsky.social
Yes, this is spot on. At the moment it is definitely super useful to me, but it is easy to be taken down rabbit holes and end up with garbage if you use it too far outside of your expertise. Educating new researchers in the presence of these tools will be an adventure.
aaroth.bsky.social
It should go without saying that this is already the biggest change in the manner in which I do research in my career so far.
aaroth.bsky.social
This was a successful enough experiment that I think it will be my standard workflow going forward --- though I'm cognizant that the tools are improving all the time and I'll need to be ready to adopt new ones as things change.
aaroth.bsky.social
We also did experiments, which we wouldn't normally do (and which are unusual for a SODA paper) --- but they were super easy since we were already working in Windsurf. You can just describe the experiments in English, and it already has the context of the setup from the paper.
aaroth.bsky.social
Finally, once we had a first draft, we moved to Overleaf, took a pass over the mathematical sections by hand (checking correctness again, but also reorganizing things in ways that made more sense), and wrote all of the exposition (intro, etc.) by hand.
aaroth.bsky.social
For example, we tried to get it to improve the lower bound (which doesn't match the upper bound). I didn't have a construction in mind. It would suggest plausible, long arguments, but they all turned out to have errors. We wasted a good amount of time on that.
aaroth.bsky.social
Mostly we already had the theorems in our heads, but sometimes Gemini would prove a lemma in a surprising way, simplify an argument, or improve some bounds. When we tried to get it to prove things we didn't know how to prove, it would often take us down rabbit holes.
aaroth.bsky.social
We would verify the correctness of each intermediate step before moving on. Generally, if we gave it the kind of high-level argument a colleague would understand, it would provide a formal proof without error. When it did make errors they were human-like (e.g., applying Jensen's inequality the wrong way).
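For concreteness, "Jensen the wrong way" means flipping the direction of Jensen's inequality. For a convex function f and a random variable X:

    f(E[X]) <= E[f(X)]

The tempting mistake is to assert the reverse, E[f(X)] <= f(E[X]), which holds only when f is concave. (E.g., f(x) = x^2 gives E[X]^2 <= E[X^2], never the other way around.)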
aaroth.bsky.social
It would then expand these informal arguments into formal lemma and theorem statements, directly in the tex documents we were working in. It would provide proofs. It had access to all of the tex docs for the paper, which was useful for keeping the notation consistent, etc.
aaroth.bsky.social
The first draft of all of the math (prelims, lemmas, theorems) was done in the Windsurf environment, using Gemini 2.5 Pro. We would describe mathematical arguments to it as we would to a colleague without a whiteboard: intuitions and high-level arguments, but no calculations.
Reposted by Aaron Roth
ncollina.bsky.social
Appearing in SODA 2026! Last year we had a 3-page SODA paper, this one is 107 pages. Next time I'm thinking we swing way back the other way and just submit a Twitter/Bluesky thread
aaroth.bsky.social
Suppose you and I both have different features about the same instance. Maybe I have CT scans and you have physician notes. We'd like to collaborate to make predictions that are more accurate than possible from either feature set alone, while only having to train on our own data.
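To make the setting concrete, here is a toy sketch in Python. It is illustrative only: the synthetic data and the single round of prediction exchange are assumptions for the example, not the protocol from the paper. Each party trains only on its own features, then uses the other party's announced predictions as one extra low-dimensional signal.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Toy setup: two parties observe different features of the same
    # instances, and the label depends on both feature sets.
    rng = np.random.default_rng(0)
    n = 2000
    x_a = rng.normal(size=(n, 3))  # party A's features (e.g., scan-derived)
    x_b = rng.normal(size=(n, 3))  # party B's features (e.g., notes-derived)
    y = (x_a @ np.array([1.0, -2.0, 0.5])
         + x_b @ np.array([0.5, 1.5, -1.0])
         + rng.normal(scale=0.1, size=n))

    tr, te = slice(0, 1500), slice(1500, None)

    # Each party trains only on its own data.
    model_a = LinearRegression().fit(x_a[tr], y[tr])
    model_b = LinearRegression().fit(x_b[tr], y[tr])

    # One round of exchange: B announces its predictions (one number per
    # instance, not its raw features), and A retrains with that message
    # appended as an extra feature.
    msg_b = model_b.predict(x_b)
    model_a2 = LinearRegression().fit(
        np.column_stack([x_a[tr], msg_b[tr]]), y[tr])

    def mse(pred):
        return float(np.mean((pred - y[te]) ** 2))

    print("A alone:         ", mse(model_a.predict(x_a[te])))
    print("B alone:         ", mse(model_b.predict(x_b[te])))
    print("A after exchange:", mse(model_a2.predict(
        np.column_stack([x_a[te], msg_b[te]]))))

In this linear toy case, one round of exchange already recovers most of the accuracy of training on the pooled features; the question the thread gestures at is what guarantees such exchanges have in general, and how quickly they converge.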
aaroth.bsky.social
The opportunities and risks of the entry of LLMs into mathematical research in one screenshot. I think it is clear that LLMs will make trained researchers more effective. But they will also lead to a flood of bad/wrong papers, and I'm not sure we have the tools to deal with this.
aaroth.bsky.social
The paper is here: arxiv.org/abs/2509.15090 and is joint work with the excellent @ncollina.bsky.social, @surbhigoel.bsky.social, Emily Ryu, and Mirah Shi.
aaroth.bsky.social
I'm excited about this line of work. We get clean results in a stylized setting, and there is much to do to bring these kinds of ideas closer to practice. But I think that ideas from market and mechanism design should have lots to say about the practical alignment problem too.