Tom Everitt
@tom4everitt.bsky.social
1.1K followers 350 following 96 posts
AGI safety researcher at Google DeepMind, leading causalincentives.com. Personal website: tomeveritt.se
Pinned
tom4everitt.bsky.social
What if LLMs are sometimes capable of doing a task but don't try hard enough to do it?

In a new paper, we use subtasks to assess capabilities. Perhaps surprisingly, LLMs often fail to fully employ their capabilities, i.e. they are not fully *goal-directed* 🧵

arxiv.org/abs/2504.118...
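One way to picture the idea (a toy framing of my own, with made-up numbers; the paper's actual metric may differ): compare how often a model solves a composite task with what its measured success on the constituent subtasks would predict.

```python
# Toy framing, hypothetical numbers -- not necessarily the paper's metric:
# if the model solves each subtask reliably but the full task far less often
# than those subtask rates would predict, the gap suggests it is not fully
# deploying its capabilities, i.e. limited goal-directedness.
subtask_success = {"parse_input": 0.95, "do_arithmetic": 0.90, "format_answer": 0.98}
observed_full_task = 0.55

predicted_full_task = 1.0
for rate in subtask_success.values():
    predicted_full_task *= rate  # assumes independent failures, an optimistic bound

print(f"predicted from subtasks: {predicted_full_task:.2f}")   # ~0.84
print(f"observed on full task:   {observed_full_task:.2f}")
print(f"goal-directedness gap:   {predicted_full_task - observed_full_task:.2f}")
```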
tom4everitt.bsky.social
Interesting. Could the measure also be applied to the human, assessing changes to their empowerment over time?
tom4everitt.bsky.social
Interesting, does the method rely on being able to set different goals for the LLM?
Reposted by Tom Everitt
tobyord.bsky.social
Evaluating the Infinite
🧵
My latest paper tries to solve a longstanding problem afflicting fields such as decision theory, economics, and ethics — the problem of infinities.
Let me explain a bit about what causes the problem and how my solution avoids it.
1/N
arxiv.org/abs/2509.19389
Evaluating the Infinite
I present a novel mathematical technique for dealing with the infinities arising from divergent sums and integrals. It assigns them fine-grained infinite values from the set of hyperreal numbers in a ...
arxiv.org
tom4everitt.bsky.social
Interesting. I recall Rich Sutton making a similar suggestion in the 2nd edition of his RL book, arguing we should optimize average reward rather than discounted reward
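For context, the two objectives being contrasted are (standard textbook definitions, not taken from either paper):

```latex
% Discounted return vs. average reward (standard definitions)
G_t = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}
\qquad \text{vs.} \qquad
r(\pi) = \lim_{h \to \infty} \frac{1}{h} \sum_{t=1}^{h} \mathbb{E}[R_t \mid \pi]
```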
Reposted by Tom Everitt
egrefen.bsky.social
Do you have a PhD (or equivalent) or will have one in the coming months (i.e. 2-3 months away from graduating)? Do you want to help build open-ended agents that help humans do human things better, rather than replace them? We're hiring 1-2 Research Scientists! Check the 🧵👇
Reposted by Tom Everitt
yoshuabengio.bsky.social
digital-strategy.ec.europa.eu/en/policies/... The Code also has two other, separate Chapters (Copyright, Transparency). The Chapter I co-chaired (Safety & Security) is a compliance tool for the small number of frontier AI companies to whom the “Systemic Risk” obligations of the AI Act apply.
2/3
The General-Purpose AI Code of Practice
The Code of Practice helps industry comply with the AI Act legal obligations on safety, transparency and copyright of general-purpose AI models.
digital-strategy.ec.europa.eu
Reposted by Tom Everitt
vkrakovna.bsky.social
As models advance, a key AI safety concern is deceptive alignment / "scheming" – where AI might covertly pursue unintended goals. Our paper "Evaluating Frontier Models for Stealth and Situational Awareness" assesses whether current models can scheme. arxiv.org/abs/2505.01420
Reposted by Tom Everitt
skiandsolve.bsky.social
First position paper I ever wrote. "Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence" arxiv.org/abs/2506.23908 Background: I'd like LLMs to help me do math, but statistical learning seems inadequate to make this happen. What do you all think?
Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence
Sound deductive reasoning -- the ability to derive new knowledge from existing facts and rules -- is an indisputably desirable aspect of general intelligence. Despite the major advances of AI systems ...
arxiv.org
Reposted by Tom Everitt
davidlindner.bsky.social
Can frontier models hide secret information and reasoning in their outputs?

We find early signs of steganographic capabilities in current frontier models, including Claude, GPT, and Gemini. 🧵
tom4everitt.bsky.social
This is an interesting explanation. But surely boys falling behind is nevertheless an important and underrated problem?
tom4everitt.bsky.social
Interesting. But is case 2 *real* introspection? It infers its internal temperature from its external output, which feels more like exospection than proper introspection. (I know human "intro"spection often works like this too, but still)
tom4everitt.bsky.social
Thought provoking
anilseth.bsky.social
1/ Can AI be conscious? My Behavioral & Brain Sciences target article on ‘Conscious AI and Biological Naturalism’ is now open for commentary proposals. Deadline is June 12. Take-home: real artificial consciousness is very unlikely along current trajectories. www.cambridge.org/core/journal...
Call for Commentary Proposals - Conscious artificial intelligence and biological naturalism
www.cambridge.org
tom4everitt.bsky.social
… and many more! Check out our paper arxiv.org/pdf/2506.01622, or come chat to @jonrichens.bsky.social, @dabelcs.bsky.social or Alexis Bellot at #ICML2025
arxiv.org
tom4everitt.bsky.social
Causality. In previous work we showed a causal world model is needed for robustness. It turns out you don't need as much causal knowledge of the environment for task generalization as you do for robustness. There is a causal hierarchy, but for agency and agent capabilities, rather than inference!
tom4everitt.bsky.social
Emergent capabilities. To minimize training loss across many goals, agents must learn a world model, which can solve tasks the agent was not explicitly trained on. Simple goal-directedness gives rise to many capabilities (social cognition, reasoning about uncertainty, intent…).
tom4everitt.bsky.social
Safety. Several approaches to AI safety require accurate world models, but agent capabilities could outpace our ability to build them. Our work gives a theoretical guarantee: we can extract world models from agents, and the model fidelity increases with the agent's capabilities.
tom4everitt.bsky.social
Extracting world knowledge from agents. We derive algorithms that recover a world model given the agent’s policy and goal (policy + goal -> world model). These algorithms complete the triptych of planning (world model + goal -> policy) and IRL (world model + policy -> goal).
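A schematic of that triptych, with illustrative signatures only (my own sketch, not the paper's actual algorithms; see the linked paper for those):

```python
def plan(world_model, goal):
    """Planning: world model + goal -> policy."""
    raise NotImplementedError

def inverse_rl(world_model, policy):
    """Inverse RL: world model + policy -> goal."""
    raise NotImplementedError

def extract_world_model(policy, goal):
    """The new, third direction: policy + goal -> (approximate) world model."""
    raise NotImplementedError
```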
tom4everitt.bsky.social
Fundamental limitations on agency. In environments where the dynamics are provably hard to learn, or where long-horizon prediction is infeasible, the capabilities of agents are fundamentally bounded.
tom4everitt.bsky.social
No model-free path. If you want to train an agent capable of a wide range of goal-directed tasks, you can’t avoid the challenge of learning a world model. And to improve performance or generality, agents need to learn increasingly accurate and detailed world models.
tom4everitt.bsky.social
These results have several interesting consequences, from emergent capabilities to AI safety… 👇
tom4everitt.bsky.social
And to achieve lower regret, or more complex goals, agents must learn increasingly accurate world models. Goal-conditioned policies are informationally equivalent to world models! But this holds only for goals over multi-step horizons; myopic agents do not need to learn world models.
tom4everitt.bsky.social
Specifically, we show it's possible to recover a bounded-error approximation of the environment transition function from any goal-conditioned policy that satisfies a regret bound across a wide enough set of simple goals, like steering the environment into a desired state.
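To build intuition for why this is possible, here is a toy sketch of my own (not the paper's algorithm): a policy that is optimal for one-step "reach state g" goals must pick, in each state, an action maximising the probability of reaching g, so querying it over all state-goal pairs already reveals part of the transition structure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3

# Ground-truth transition tensor P[s, a, s'], hidden inside the policy below.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def goal_conditioned_policy(s, g):
    """Optimal policy for the goal 'reach state g in one step'."""
    return int(np.argmax(P[s, :, g]))

# Query the policy for every (state, goal) pair: this recovers, for each pair,
# which action makes the goal most likely -- world-model information extracted
# purely from the policy's behaviour.
recovered = np.array([[goal_conditioned_policy(s, g) for g in range(n_states)]
                      for s in range(n_states)])
assert np.array_equal(recovered, P.argmax(axis=1))
```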
tom4everitt.bsky.social
Turns out there’s a neat answer to this question. We prove that any agent capable of generalizing to a broad range of simple goal-directed tasks must have learned a predictive model capable of simulating its environment. And this model can always be recovered from the agent.
tom4everitt.bsky.social
World models are foundational to goal-directedness in humans, but are hard to learn in messy open worlds. We're now seeing generalist, model-free agents (Gato, PaLM-E, Pi-0…). Do these agents learn implicit world models, or have they found another way to generalize to new tasks?
tom4everitt.bsky.social
Are world models necessary to achieve human-level agents, or is there a model-free short-cut?
Our new #ICML2025 paper tackles this question from first principles, and finds a surprising answer: agents _are_ world models… 🧵
arxiv.org/abs/2506.01622