Ivan Kartáč
@ivankartac.bsky.social
48 followers 200 following 3 posts
PhD student @ Charles University. Researching evaluation and explainability of reasoning in language models.
ivankartac.bsky.social
OpeNLGauge comes in two variants: a prompt-based ensemble and a smaller fine-tuned model, both built exclusively on open-weight LLMs (including training data!).

Thanks @tuetschek.bsky.social and @mlango.bsky.social!
ivankartac.bsky.social
We introduce an explainable metric for evaluating a wide range of natural language generation tasks, without any need for reference texts. Given an evaluation criterion, the metric provides fine-grained assessments of the output by highlighting and explaining problematic spans in the text.
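As a rough illustration of the span-level output described above (the field names and rendering are my own assumptions, not the actual OpeNLGauge schema), an annotation that highlights and explains a problematic span might look like this:

```python
# Hypothetical sketch: span-level error annotation for an NLG output.
# The schema (start/end offsets, criterion, explanation) is illustrative only.

def render_annotated(text, spans):
    """Mark each problematic span inline as [span](explanation)."""
    parts = []
    pos = 0
    for span in sorted(spans, key=lambda s: s["start"]):
        parts.append(text[pos:span["start"]])
        parts.append(f"[{text[span['start']:span['end']]}]({span['explanation']})")
        pos = span["end"]
    parts.append(text[pos:])
    return "".join(parts)

output = "The capital of France is Berlin."
annotation = [
    {
        "start": 25,
        "end": 31,
        "criterion": "factual accuracy",
        "explanation": "Berlin is not the capital of France",
    }
]
print(render_annotated(output, annotation))
# → The capital of France is [Berlin](Berlin is not the capital of France).
```

The point is that each flagged span comes with a character range, the evaluation criterion it violates, and a free-text explanation, so the assessment stays fine-grained and reference-free.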
ivankartac.bsky.social
Our paper "OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs" has been accepted at #INLG2025!

You can read the preprint here: arxiv.org/abs/2503.11858
Reposted by Ivan Kartáč
ufal.mff.cuni.cz
#ACL2025NLP in Vienna 🇦🇹 starts today with 23 🤯 @ufal-cuni.bsky.social folks presenting their work both at the main conference and workshops. Check out our main conference papers today and on Wednesday 👇
Reposted by Ivan Kartáč
ufal.mff.cuni.cz
Today, @tuetschek.bsky.social shared his team's work on evaluating LLM text generation with both human annotation frameworks and LLM-based metrics. Their approach tackles the benchmark data-leakage problem, showing how to obtain unseen data for unbiased LLM testing.
Reposted by Ivan Kartáč
zdenekkasner.bsky.social
How do LLMs compare to human crowdworkers in annotating text spans? 🧑🤖

And how can span annotation help us with evaluating texts?

Find out in our new paper: llm-span-annotators.github.io

Arxiv: arxiv.org/abs/2504.08697