Sardine Lab
@sardine-lab-it.bsky.social
SARDINE (Structure AwaRe moDelIng for Natural LanguagE) is a research group at Instituto de Telecomunicações and Instituto Superior Técnico, in Lisbon, Portugal, led by André Martins.
Reposted by Sardine Lab
As neural metrics are a pillar of #MT, used extensively for evaluation but also for improving translation, we'd want them to be fair.

🚨 Our #ACL2025 paper shows they consistently and unduly favor masculine-inflected translations, or gendered forms, over neutral ones.

arxiv.org/pdf/2410.10995
July 14, 2025 at 2:00 PM
New paper from Manos Zaranis, @tozefarinhas.bsky.social, and other sardines!! 🚀

Meet MF² (Movie Facts & Fibs): a new benchmark for long-movie understanding

This benchmark focuses on narrative understanding (key events, emotional arcs, causal chains) in long movies.

Paper: arxiv.org/abs/2506.06275
Movie Facts and Fibs (MF²): A Benchmark for Long Movie Understanding
Despite recent progress in vision-language models (VLMs), holistic understanding of long-form video content remains a significant challenge, partly due to limitations in current benchmarks. Many focus...
arxiv.org
June 24, 2025 at 3:51 PM
Applications for the 2025 Lisbon Machine Learning Summer School (LxMLS) are open, with @andre-t-martins.bsky.social as one of the organizers.
LxMLS is a great opportunity to learn from top speakers and to interact with other students. You can apply for a scholarship.

Apply here:
lxmls.it.pt/2025/
LxMLS 2025 - The 15th Lisbon Machine Learning Summer School
lxmls.it.pt
February 28, 2025 at 3:35 PM
Reposted by Sardine Lab
📣 New paper alert! We released a new safety benchmark for VLMs with a core focus on test cases that become unsafe by combining text and images.

TL;DR: many modern VLMs are unsafe across various types of queries and languages.

arxiv.org/abs/2501.10057
huggingface.co/datasets/fel...
Today, we are releasing MSTS, a new Multimodal Safety Test Suite for vision-language models!

MSTS is exciting because it tests for safety risks *created by multimodality*. Each prompt consists of a text + image that *only in combination* reveal their full unsafe meaning.

🧵
January 22, 2025 at 1:43 PM
🎉 New paper by Saul Santos in collaboration with @tozefarinhas.bsky.social and @andre-t-martins.bsky.social!! 🎉

∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation

Paper: arxiv.org/abs/2501.19098
∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation
Current video-language models struggle with long-video understanding due to limited context lengths and reliance on sparse frame subsampling, often leading to information loss. This paper introduces...
arxiv.org
February 3, 2025 at 12:22 PM