#PaperAsk
New Preprint: #PaperAsk: A Benchmark for Reliability Evaluation of #LLMs in Paper Search and Reading (via #arXiv) arxiv.org/abs/2510.22242 #scholcomm #AI #discovery #retrieval
October 28, 2025 at 8:13 PM
Yutao Wu, Xiao Liu, Yunhao Feng, Jiale Ding, Xingjun Ma
PaperAsk: A Benchmark for Reliability Evaluation of LLMs in Paper Search and Reading
https://arxiv.org/abs/2510.22242
October 28, 2025 at 5:53 AM
Yutao Wu, Xiao Liu, Yunhao Feng, Jiale Ding, Xingjun Ma: PaperAsk: A Benchmark for Reliability Evaluation of LLMs in Paper Search and Reading https://arxiv.org/abs/2510.22242 https://arxiv.org/pdf/2510.22242 https://arxiv.org/html/2510.22242
October 28, 2025 at 6:32 AM
PaperAsk: A Benchmark for Reliability Evaluation of LLMs in Paper Search and Reading | arXiv preprint
Reliability test of #AIGenerative #LLM for searching scientific publications
Guess what? "missing over 60% of relevant literature"
PaperAsk: A Benchmark for Reliability Evaluation of LLMs in Paper Search and Reading
Large Language Models (LLMs) increasingly serve as research assistants, yet their reliability in scholarly tasks remains under-evaluated. In this work, we introduce PaperAsk, a benchmark that systemat...
arxiv.org
October 29, 2025 at 5:32 AM