johndang.me
Our recent cross-institutional work asks: Does the available evidence match the current level of attention?
📜 arxiv.org/abs/2412.01946
Our recent cross-institutional work asks: Does the available evidence match the current level of attention?
📜 arxiv.org/abs/2412.01946
As part of a massive cross-institutional collaboration:
🗽Find MMLU is heavily overfit to western culture
🔍 Professional annotation of cultural sensitivity data
🌍 Release improved Global-MMLU 42 languages
📜 Paper: arxiv.org/pdf/2412.03304
📂 Data: hf.co/datasets/Coh...
As part of a massive cross-institutional collaboration:
🗽Find MMLU is heavily overfit to western culture
🔍 Professional annotation of cultural sensitivity data
🌍 Release improved Global-MMLU 42 languages
📜 Paper: arxiv.org/pdf/2412.03304
📂 Data: hf.co/datasets/Coh...
Check out the report: arxiv.org/abs/2412.04261
Check out the report: arxiv.org/abs/2412.04261