Lightnews — Scholar-powered news

@utherwayn.bsky.social

17 followers 53 following 2 posts

Just a bunny-loving game developer


utherwayn.bsky.social


@simonwillison.net I'm not trying to be an LLM denier here, but man, this paper hit home for me as someone who isn't an ML kind of person, and I'd love to see your take on it.

[2506.21521] Potemkin Understanding in Large Language Models share.google/W9cKIwYoWI5W...

Coherence seems like an important metric.

Potemkin Understanding in Large Language Models

Large language models (LLMs) are regularly evaluated using benchmark datasets. But what justifies making inferences about an LLM's capabilities based on its answers to a curated set of questions? This...

share.google

June 28, 2025 at 7:20 PM
