In the second oral paper (14:22, Room 1.62),
@yysung.bsky.social is presenting: GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration
x.com/YooYeonSung1...
(Short version: quiz bowl, a dumb trivia game, shows humans' calibration > LLMs'.)