Anil Batra
abatra.bsky.social
PhD student @ University of Edinburgh | Looking for Postdoc Opportunities | Interested in Planning, Reasoning, Long Context | Multimodal AI | 🔗 https://anilbatra2185.github.io/
Reposted by Anil Batra
Excited to (virtually) present a new evaluation metric for VLMs: "CAST: Cross-modal Alignment Similarity Test for Vision Language Models" at COLING 2025!

Paper: arxiv.org/abs/2409.11007
CAST: Cross-modal Alignment Similarity Test for Vision Language Models
Vision Language Models (VLMs) are typically evaluated with Visual Question Answering (VQA) tasks which assess a model's understanding of scenes. Good VQA performance is taken as evidence that the mode...
January 15, 2025 at 11:04 AM