Lightnews — Scholar-powered news

UCL Discovery

@ucldiscovery.bsky.social

Open Access UCL Research: UiO-UvA at SemEval-2020 Task 1: Contextualised Embeddings for Lexical Semantic Change Detection discovery.ucl.ac.uk/id/eprint/10...

UiO-UvA at SemEval-2020 Task 1: Contextualised Embeddings for Lexical Semantic Change Detection - UCL Discovery

UCL Discovery is UCL's open access repository, showcasing and providing access to UCL research outputs from all UCL disciplines.

discovery.ucl.ac.uk

November 7, 2025 at 3:42 PM

Ramon Astudillo

@ramon-astudillo.bsky.social

My main dataset trajectory was TI-DIGITS, WSJ0, Aurora5, CHiME-1-3, Librispeech and a lot of speech and language ML competition data (SemEval, MediaEval, ComParE), internal data and WMT (QE), back to Librispeech, AMR DARPA stuff, internal data again, and then Excels and Excels of LLM benchmarks

Ramon Astudillo @ramon-astudillo.bsky.social · 24d

Evolution of AI benchmarks is like:

Toy-Bench-1997
unreasonably-clean-data-Bench
5y-old-Bench
2y-old-Bench
Live-Bench (collected last year)
Alive-Bench (real world, real time user satisfaction stats)
Prediction-Market-Bench (what's Gemini's perf. EoY?)

Ramon Astudillo @ramon-astudillo.bsky.social · 24d

This looks interesting but no Gemini?

www.aiengineering.report/p/claude-cod...

October 18, 2025 at 4:39 PM

Andreas Wagner

@anwagnerdreas.hcommons.social.ap.brid.gy

Shared Tasks at https://semeval.github.io/

- SemEval 2023 Task 12: AfriSenti https://afrisenti-semeval.github.io/
- SemEval 2024 Task 1: SemRel https://semantic-textual-relatedness.github.io/
- SemEval 2025 Task 11: Bridging the Gap https://github.com/emotion-analysis-project/SemEval2025-task11 […]

Original post on hcommons.social

hcommons.social

September 24, 2025 at 12:04 PM

Nedjma Ousidhoum

@nedjmaou-nlp.bsky.social

Pleased to announce our #NLP SemEval Task 7: Everyday Knowledge Across Diverse Languages & Cultures.
We extend the BLEnD Benchmark to >30 language-culture pairs. [Our task is Junior-friendly, with live Q&A & tutorials.] 1/

September 22, 2025 at 11:14 AM

Kilian Evang

@texttheater.mastodon.social.ap.brid.gy

“MWAHAHA, which stands for Models Write Automatic Humor And Humans Annotate, is SemEval 2026's Task 1 and is the first task dedicated to advancing the state of the art in Computational Humor Generation.”

MWAHAHA: A Competition on Humor Generation […]

Original post on mastodon.social

mastodon.social

September 8, 2025 at 1:06 PM

Deutsche Nationalbibliothek

@dnb-aktuelles.bsky.social

Best Paper Award for the publication "DNB-AI-Project at SemEval-2025 Task 5: An LLM-Ensemble Approach for Automated Subject Indexing": blog.dnb.de/ki-projekt-g...

📸 DNB, Maximilian Kähler, Lisa Kluge CC BY 4.0

Visualization of the Complete and Map steps

September 2, 2025 at 11:21 AM

Finto-palvelu

@finto.some.kansalliskirjasto.fi.ap.brid.gy

Earlier this year, the Annif team participated in the LLMs4Subjects challenge, where our automated indexing tool performed nicely! 🏆 We also got new ideas for Annif development out of the challenge! The SemEval-2025 workshop proceedings are now available 👉 […]

Original post on some.kansalliskirjasto.fi

some.kansalliskirjasto.fi

August 28, 2025 at 10:58 AM

Finto-palvelu

@finto.some.kansalliskirjasto.fi.ap.brid.gy

Earlier this year, the Annif team took part in the LLMs4Subjects challenge, where our automated subject indexing tool did very well! 🏆 We got new ideas for Annif development from the competition! The SemEval-2025 workshop proceedings are now available to read 👉 https://aclanthology.org/volumes/2025.semeval-1/
The challenge […]

Original post on some.kansalliskirjasto.fi

some.kansalliskirjasto.fi

August 28, 2025 at 10:58 AM

arxiv cs.CL

@arxiv-cs-cl.bsky.social

Ladislav Lenc, Daniel Cífka, Jiří Martínek, Jakub Šmíd, Pavel Král
UWBa at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval
https://arxiv.org/abs/2508.09517

August 14, 2025 at 6:50 AM

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Ladislav Lenc, Daniel Cífka, Jiří Martínek, Jakub Šmíd, Pavel Král: UWBa at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval https://arxiv.org/abs/2508.09517 https://arxiv.org/pdf/2508.09517 https://arxiv.org/html/2508.09517

August 14, 2025 at 6:30 AM

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Adrián Gude, Roi Santos-Ríos, Francisco Prado-Valiño, Ana Ezquerro, Jesús Vilares: LyS at SemEval 2025 Task 8: Zero-Shot Code Generation for Tabular QA https://arxiv.org/abs/2508.09012 https://arxiv.org/pdf/2508.09012 https://arxiv.org/html/2508.09012

August 13, 2025 at 6:30 AM

arxiv cs.CL

@arxiv-cs-cl.bsky.social

Adrián Gude, Roi Santos-Ríos, Francisco Prado-Valiño, Ana Ezquerro, Jesús Vilares
LyS at SemEval 2025 Task 8: Zero-Shot Code Generation for Tabular QA
https://arxiv.org/abs/2508.09012

August 13, 2025 at 5:50 AM

Jim

@jimfhahn.bsky.social

... if I include myself in Gen X, then I might say as a very Gen X thing to say that I did not know I would be published in ACL and now I wish I had tried to find a better team name than Jim. Or whatever, aclanthology.org/2025.semeval...

Jim at SemEval-2025 Task 5: Multilingual BERT Ensemble

Jim Hahn. Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025). 2025.

aclanthology.org

August 11, 2025 at 5:45 PM

arxiv cs.CL

@arxiv-cs-cl.bsky.social

Catherine Kobus, François Lancelot, Marion-Cécile Martin, Nawal Ould Amer
ATLANTIS at SemEval-2025 Task 3: Detecting Hallucinated Text Spans in Question Answering
https://arxiv.org/abs/2508.05179

August 8, 2025 at 6:45 AM

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Catherine Kobus, François Lancelot, Marion-Cécile Martin, Nawal Ould Amer: ATLANTIS at SemEval-2025 Task 3: Detecting Hallucinated Text Spans in Question Answering https://arxiv.org/abs/2508.05179 https://arxiv.org/pdf/2508.05179 https://arxiv.org/html/2508.05179

August 8, 2025 at 6:30 AM

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Pranshu Rastogi: fact check AI at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-checked Claim Retrieval https://arxiv.org/abs/2508.03475 https://arxiv.org/pdf/2508.03475 https://arxiv.org/html/2508.03475

August 6, 2025 at 6:30 AM

arxiv cs.CL

@arxiv-cs-cl.bsky.social

Pranshu Rastogi
fact check AI at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-checked Claim Retrieval
https://arxiv.org/abs/2508.03475

August 6, 2025 at 6:01 AM

arxiv cs.CL

@arxiv-cs-cl.bsky.social

Jiyu Chen, Necva Bölücü, Sarvnaz Karimi, Diego Mollá, Cécile L. Paris
CSIRO-LT at SemEval-2025 Task 11: Adapting LLMs for Emotion Recognition for Multiple Languages
https://arxiv.org/abs/2508.01161

August 5, 2025 at 10:53 AM

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Jiyu Chen, Necva Bölücü, Sarvnaz Karimi, Diego Mollá, Cécile L. Paris: CSIRO-LT at SemEval-2025 Task 11: Adapting LLMs for Emotion Recognition for Multiple Languages https://arxiv.org/abs/2508.01161 https://arxiv.org/pdf/2508.01161 https://arxiv.org/html/2508.01161

August 5, 2025 at 6:29 AM

Sara Rosenthal

@seirasto.bsky.social

🚀Excited to announce our MTRAGEval task at SemEval 2026!

Arxiv: arxiv.org/abs/2501.03468
Github: github.com/IBM/mt-rag-b... (please 🌟!)
MTRAGEval: ibm.github.io/mt-rag-bench...

MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems

Retrieval-augmented generation (RAG) has recently become a very popular task for Large Language Models (LLMs). Evaluating them on multi-turn RAG conversations, where the system is asked to generate a ...

arxiv.org

August 4, 2025 at 6:33 AM

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Atakan Site, Emre Hakan Erdemir, Gülşen Eryiğit: ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation https://arxiv.org/abs/2508.00762 https://arxiv.org/pdf/2508.00762 https://arxiv.org/html/2508.00762

August 4, 2025 at 6:30 AM

arxiv cs.CL

@arxiv-cs-cl.bsky.social

Atakan Site, Emre Hakan Erdemir, Gülşen Eryiğit
ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation
https://arxiv.org/abs/2508.00762

August 4, 2025 at 4:31 AM

fortext lab

@fortextlab.bsky.social

🔍 Sample data is already available!

We invite researchers, teams, and solo experimenters to benchmark systems and explore how machines understand stories.

All the details are on our site:
🌐 narrative-similarity-task.github.io

#SemEval #NLP #NarrativeAI #DH #CLS

SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning

narrative-similarity-task.github.io

August 1, 2025 at 3:18 PM

fortext lab

@fortextlab.bsky.social

🎉 We’re excited to kick things off! We’re launching SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning. This is a joint effort by @hanshatzel.bsky.social (University of Hamburg) and the fortext lab, as part of the PLANS project, in collaboration with Toloka AI.

August 1, 2025 at 3:15 PM

Narrative Similarity Task

@narrativesimtask.bsky.social

🚨 Shared Task Alert! 🚨
We are announcing the shared task on narrative similarity: SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning

We invite you to benchmark LLMs, embedding models, or even test your favorite narrative formalism. Sample data is now available!

SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning

narrative-similarity-task.github.io

August 1, 2025 at 1:42 PM
