@ehudreiter.bsky.social
ehudreiter.bsky.social
Somewhat frustrated yesterday to once again read ACL paper which did all sorts of complex things (including the usual results tables showing best approach) on garbage data. With minimal ack of this in limitations. Most fundamental rule of CS is Garbage In, Garbage Out
ehudreiter.bsky.social
New blog: Good diagrams for research papers

I've seen a number of diagrams recently which are too complicated and difficult to understand. I explain some of the problems I see and give advice.

ehudreiter.com/2025/10/08/g...
ehudreiter.bsky.social
Several people have asked me recently if I will still be able to contribute to research projects after I retire in summer 2026. Absolutely! I will have emeritus status, and am very happy to remain involved in research projects at Aberdeen and elsewhere.
ehudreiter.bsky.social
New blog: Reflections on blogging

I am often asked about my experience blogging, sometimes by people who are considering writing their own blog. In this “meta” blog, I summarise my thoughts and experiences about my blog.

ehudreiter.com/2025/09/23/r...
ehudreiter.bsky.social
Aberdeen CS will probably be looking for a new lecturer in NLP. Formal advert is not out yet, but feel free to contact me informally if interested.
Reposted
siggen.bsky.social
The registration page for #INLG2025 is now live! Join us in Vietnam from Oct 29 to Nov 2 for the best conference on #NaturalLanguageGeneration

2025.inlgmeeting.org/registration...

Curious to see what will be presented? Check out this list of accepted papers! 2025.inlgmeeting.org/accepted-pap...
Picture of the One Pillar Pagoda in Hanoi, a pagoda raised up over a green pond surrounded by greenery
ehudreiter.bsky.social
New blog: Defining hallucination is not straightforward

Many researchers assume that hallucination is a binary feature; either something is a hallucination or it is not. This is too simplistic. I describe some of the issues I have seen below.

ehudreiter.com/2025/09/10/d...
Defining hallucination is not straightforward
Most academic work assumes that hallucination is a binary feature: either something is a hallucination or it is not a hallucination. But this is too simplistic. In real-world contexts we see many s…
ehudreiter.com
ehudreiter.bsky.social
At ACL, I engaged with 50 papers (went to oral, talked to poster person). Decided (sometimes after looking at the paper) that 3 of these were robust, interesting, and relevant to me; 2 of these 3 won awards. Hmm, maybe in future I should focus on 40 award papers, ignore the other 3000?
ehudreiter.bsky.social
Last week I had to deal with two cases of papers containing hallucinated references. This is not acceptable! Shows complete disdain for understanding prev work, and suggests rest of paper may be fabricated.

Ok to use LLM to suggest related work, but read (or at least skim) them!
ehudreiter.bsky.social
Watched recording of ACL panel on generalisability (recommended to me). I share concerns about "LLM popcorn", but my biggest concern about NLP is lack of research diversity. Everyone does LLM, few people do impact or qual eval, little interest in genuine collab with other fields
ehudreiter.bsky.social
New blog: I hate pay-to-publish

The academic world has changed since I got my PhD in 1990. One of the worst changes is that researchers now often pay thousands of pounds to publish their work. Unfair to researchers with limited funding, and bad for science.

ehudreiter.com/2025/08/19/i...
I hate pay-to-publish
The academic world has changed in many ways since I got my PhD in 1990. One of the worst changes is that researchers in 2025 usually need to pay thousands of pounds to publish their work. This is u…
ehudreiter.com
Reposted
nfel.bsky.social
Excited to announce the first-ever Workshop for Young Researchers in Natural Language Generation (YNLG), supported by @siggen.bsky.social, taking place on October 29, 2025 in Hanoi, Vietnam, co-located with INLG 2025.
Call for Submissions is out now!

ynlg-workshop.github.io
ehudreiter.bsky.social
New blog: More on evaluating impact

I got great feedback from recent paper and talk on eval impact, and summarise some of the suggested papers (including more examples of impact eval) and insightful comments (eg, about eval “ecosystem”) I received.

ehudreiter.com/2025/08/05/m...
More on evaluating impact
I recently published a paper and gave a talk about evaluating real-world impact. I got some great feedback from this, and summarise some of the suggested papers (including more examples of impact e…
ehudreiter.com
ehudreiter.bsky.social
I'll be at ACL next week (Tue-Thur, not Sun/Mon). Look forward to meeting old friends and new people who want to connect! I'll also be giving an invited talk on impact evaluation at the GEM workshop on Thur 31 July
ehudreiter.bsky.social
Really happy that this survey of NLP in cancer care, from my student Mengxuan Sun, has finally appeared (it's been a saga). One key but depressing finding is that evaluation quality is uniformly dreadful by medical standards; NLP researchers just don't seem to care...

doi.org/10.1016/j.ar...
ehudreiter.bsky.social
Motivated by recent discussion with my group:
Ignore subjective statements such as "I find LLMs to be incredibly useful for XX", especially when made by people (such as AI companies or gurus) who have strong biases/incentives/COI.
ehudreiter.bsky.social
Nice example of using RCT to measure real-world impact of LLMs (and discovering that it is disappointing)
metr.org
METR @metr.org · Jul 10
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers.

The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
ehudreiter.bsky.social
Good point, in some cases I have struggled to convince companies to publish. But in other cases we could publish. I guess depends on the company and the people who make this decision, and also on what is being published (eg very hard to publish negative result about company's product!)
ehudreiter.bsky.social
I'll also give an invited talk about impact evaluation at the ACL GEM workshop
ehudreiter.bsky.social
I've written a "Last Word" opinion piece for CL about evaluating real-world impact. It
* looks at how impact can be evaluated
* shows via a structured survey that perhaps 0.1% of ACL Anth papers measure real-world impact
* discusses why this is the case

arxiv.org/abs/2507.05973
We Should Evaluate Real-World Impact
The ACL community has very little interest in evaluating the real-world impact of NLP systems. A structured survey of the ACL Anthology shows that perhaps 0.1% of its papers contain such evaluations; ...
arxiv.org
ehudreiter.bsky.social
Looked at Google Scholar, nice to see that my h-index has reached 60