Lightnews — Scholar-powered news

Reposted by Sam Earle

Ahmed Khalifa @amidos2006.bsky.social · Aug 28

We (me, @smearle.bsky.social, and @togelius.bsky.social) are working to better understand the relationship between human and AI solutions in #PuzzleScript games.

Help us by playing #online #puzzlescript games. No personal data is collected. Links are in the subsequent posts.

Sam Earle @smearle.bsky.social · Aug 27

We introduce PuzzleJAX, a benchmark for reasoning and learning. 🧩💡🦎

PuzzleJAX compiles hundreds of existing grid-based PuzzleScript games to hardware-accelerated JAX environments, and allows researchers to define new tasks via PuzzleScript's concise rewrite rule-based DSL.


Sam Earle @smearle.bsky.social · Aug 27

Work w/ Graham Todd, Yuchen Li, @amidos2006.bsky.social, Muhammad Umair Nasir, @zehuajiang.bsky.social, Andrzej Banburski-Fahey, and @togelius.bsky.social. Thanks to @increpare.bsky.social for creating and maintaining PuzzleScript, and to the many designers who have created beautiful things with it.


Sam Earle @smearle.bsky.social · Aug 27

Much remains to be done. Can we lead LLMs to moments of creative discovery, allowing them to unlock trickier puzzles? Can we train robust RL players using curricula of levels/mechanics? Can we use feedback from diverse AI players to guide the synthesis of interesting new games?


Sam Earle @smearle.bsky.social · Aug 27

Now, we can begin to see how AI players respond to these challenges. The picture may be starkly surprising to some: simple tree search finds most solutions rapidly, while RL falls prey to obvious local minima, and LLMs spin their wheels when faced with unfamiliar semantics.
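To make the "simple tree search" point concrete, here is a minimal sketch of breadth-first search on a tiny Sokoban-style level. It is an illustration only: the level format, the `parse` and `solve` helpers, and the `max_states` cap are assumptions made for this example, not PuzzleJAX's API or the search procedure used in the paper.

```python
from collections import deque

# Hypothetical level format: '#' wall, 'P' player, 'C' crate, 'T' target, '.' floor.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def parse(level):
    """Split a list of strings into wall, crate, and target sets plus the player position."""
    walls, crates, targets, player = set(), set(), set(), None
    for y, row in enumerate(level):
        for x, ch in enumerate(row):
            if ch == "#":
                walls.add((y, x))
            elif ch == "C":
                crates.add((y, x))
            elif ch == "T":
                targets.add((y, x))
            elif ch == "P":
                player = (y, x)
    return walls, frozenset(crates), frozenset(targets), player

def solve(level, max_states=100_000):
    """Breadth-first search over (player, crates) states; returns a shortest move list or None."""
    walls, crates, targets, player = parse(level)
    start = (player, crates)
    queue = deque([(start, [])])
    seen = {start}
    while queue and len(seen) <= max_states:
        (pos, boxes), path = queue.popleft()
        if boxes == targets:  # win: every crate sits on a target
            return path
        for name, (dy, dx) in MOVES.items():
            nxt = (pos[0] + dy, pos[1] + dx)
            if nxt in walls:
                continue
            new_boxes = boxes
            if nxt in boxes:  # moving into a crate pushes it, if the cell beyond is free
                beyond = (nxt[0] + dy, nxt[1] + dx)
                if beyond in walls or beyond in boxes:
                    continue
                new_boxes = (boxes - {nxt}) | {beyond}
            state = (nxt, new_boxes)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [name]))
    return None

print(solve(["######", "#P.CT#", "######"]))
```

The visited set keeps the search from revisiting states. On small grids the reachable state space is often modest, which is one plausible reason exhaustive search does well here.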


Sam Earle @smearle.bsky.social · Aug 27

PuzzleScript games make for a great benchmark. Often, despite their mechanical simplicity, they elicit moments of insight in human players. Since 2013, casual and professional designers have brought considerable ingenuity to the language, generating a plethora of diverse games.

[Image: three series of screenshots from different PuzzleScript games: LimeRick, Kettle, and Take Heart Lass.]


Sam Earle @smearle.bsky.social · Aug 27

Paper: arxiv.org/abs/2508.16821
Code: github.com/smearle/scri...

PuzzleJAX is a faithful re-implementation of PuzzleScript (puzzlescript.net) capturing all of the engine's major features. It leverages the convolutional nature of rewrite rules to achieve major speedups in JAX.

[Image: plots comparing the speed of the original PuzzleScript and PuzzleJAX across various games, with PuzzleJAX run at different batch sizes (i.e., numbers of concurrent parallel environments). PuzzleJAX is faster, particularly at larger batch sizes.]
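One way to picture the "convolutional nature of rewrite rules": a rule's left-hand side can be slid across the grid like a kernel, with a full-score window marking a match. The sketch below is a pure-Python stand-in written for illustration. The tile codes, the `match_map` and `apply_rule` helpers, and the single-application behavior are assumptions, not PuzzleJAX's actual implementation, which runs in JAX and handles PuzzleScript's full rule semantics.

```python
# Hypothetical tile codes for a toy grid.
EMPTY, PLAYER, CRATE, WALL = 0, 1, 2, 3

def one_hot(grid, n_types=4):
    """Encode each cell as a one-hot vector over tile types."""
    return [[[1 if cell == t else 0 for t in range(n_types)] for cell in row] for row in grid]

def match_map(grid, pattern):
    """Cross-correlate the one-hot grid with a 1xK kernel; a score of K means a full match."""
    hot = one_hot(grid)
    k = len(pattern)
    hits = []
    for y, row in enumerate(hot):
        for x in range(len(row) - k + 1):
            score = sum(row[x + i][pattern[i]] for i in range(k))
            if score == k:
                hits.append((y, x))
    return hits

def apply_rule(grid, lhs, rhs):
    """Rewrite the first match of lhs with rhs (applied once, a simplification)."""
    hits = match_map(grid, lhs)
    new_grid = [row[:] for row in grid]
    if hits:
        y, x = hits[0]
        new_grid[y][x:x + len(rhs)] = rhs
    return new_grid

# Toy "push right" rule: [Player | Crate | Empty] -> [Empty | Player | Crate]
grid = [[WALL, PLAYER, CRATE, EMPTY, WALL]]
print(match_map(grid, [PLAYER, CRATE, EMPTY]))
print(apply_rule(grid, [PLAYER, CRATE, EMPTY], [EMPTY, PLAYER, CRATE]))
```

Because every window is scored independently, this matching step maps naturally onto a batched convolution on an accelerator, which is the kind of speedup the post describes.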

