cyrfrench.bsky.social
Reposted
Have you ever looked at the impressive results that LLMs get on benchmarks and wondered if these results are everything they seem?

If you'd like to learn how data leakage calls reported LLM performance results into question, check out my latest blog post.

t-redactyl.io/posts/2025-1...
Data leakage is a major issue when measuring LLM performance
Why data leakage and benchmark contamination distort LLM performance claims, from coding puzzles to the LM Arena and training data exhaustion.
January 13, 2026 at 3:32 PM