Reposted by Desmond Elliott
Philipp Mondorf
@pmondorf.bsky.social
· Jul 18
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
There is an increasing trend towards evaluating NLP models with LLMs instead of human judgments, raising questions about the validity of these evaluations, as well as their reproducibility in the case...
doi.org
Reposted by Desmond Elliott
#ICCV2025
@iccv.bsky.social
· Jun 25
Desmond Elliott
@delliott.bsky.social
· Jun 22
Desmond Elliott
@delliott.bsky.social
· Jun 20
Reposted by Desmond Elliott
Desmond Elliott
@delliott.bsky.social
· Jun 11
Reposted by Desmond Elliott
Maria Antoniak
@mariaa.bsky.social
· May 9
Desmond Elliott
@delliott.bsky.social
· May 8
Reposted by Desmond Elliott
Andrew Lampinen
@lampinen.bsky.social
· May 1
Desmond Elliott
@delliott.bsky.social
· Apr 14