Marcel Bollmann
@marcel.bollmann.me
1.7K followers 350 following 120 posts
Associate professor at @liu.se 🇸🇪, site development lead for @aclanthology.org, editor-in-chief at @nejlt.bsky.social. Mildly obscure #NLP researcher. I like coffee and board games. 🏠 https://marcel.bollmann.me/
Pinned
marcel.bollmann.me
Hi, I’m Marcel! I work on #NLP for lesser-resourced languages, multilingual NLP, cross-lingual transfer, tokenisation, and trying to keep up with the flood of LLM-related research. Joining Bluesky seems like a good opportunity for a new & shiny #introduction, so here goes!
marcel.bollmann.me
I often long for a place to just post whimsical personal updates for friends, but that kind of place doesn’t exist anymore. In my personal bubble, social media has long become too fragmented and/or abandoned for that purpose.
Reposted by Marcel Bollmann
zehavoc.bsky.social
Just found out that yet another paper on North African Arabizi didn't find our work worth citing. They even wrote "No Arabizi-specific metric or resource exists for our dialect selection". We were the first to release an annotated dataset for this dialect, published at ACL and shit. Discouraging.
Your dataset looks very cool, but I don't understand why you say “no Arabizi-specific metric or resource exists for our dialect selection”? When you contacted me, it seemed to me that you were aware of my work on Arabizi (e.g., [1,2], not to mention the cross-lingual work with Maltese [2] or character-based language models for Arabizi [4]). One of the crucial points of this work was also to propose translations into French from Algerian Arabizi, which could have helped you use a ground truth for your translation models. I'll be honest with you, I find it extremely discouraging to see that pioneering work in the processing of a language with such limited resources as Algerian Arabic dialect is not cited, even though it has been published in the major conference in the field and the data is freely available (unlike the vast majority of dialectal resources for Arabic). If even colleagues working on the same language don't find it necessary to cite us, what's the point of investing so much time and money in this type of work?

In short, I hope your work doesn't encounter the same pitfalls.


[1] https://www.aclweb.org/anthology/2020.acl-main.107.pdf
[2] https://arxiv.org/abs/2306.14866
[3] https://arxiv.org/abs/2005.00318
[4] https://arxiv.org/abs/2110.13658

(deepL translated, from French)
Reposted by Marcel Bollmann
nejlt.bsky.social
📄 New article published:

“Controlling Language and Style of Multi-lingual Generative Language Models with Control Vectors” by Julius Leino & Jussi Karlgren

nejlt.ep.liu.se/article/view...
Reposted by Marcel Bollmann
pranav-nlp.bsky.social
I'm conducting research on how ACL's peer review policies impact NLP research quality, career trajectories, and inclusivity within our community. I am running a survey, which would take around 7-10 mins to complete: forms.cloud.microsoft/e/j2jr9nH3X0

I would really appreciate insights from y'all!
I'm conducting research on how ACL's peer reviewing policies impact NLP research quality, career trajectories, and inclusivity within our community. Your insights—whether you're a seasoned reviewer, early-career researcher, or anywhere in between—are invaluable.
The survey takes 7-10 minutes and covers topics like review quality, reviewer assignment, and accessibility barriers. All responses are confidential and will help inform evidence-based improvements to our peer review processes.
Reposted by Marcel Bollmann
weissweiler.bsky.social
📢Life update📢

🥳I'm excited to share that I've started as a postdoc at Uppsala University NLP @uppsalanlp.bsky.social, working with Joakim Nivre on topics related to constructions and multilinguality!

🙏Many thanks to the Walter Benjamin Programme of the DFG for making this possible.
marcel.bollmann.me
I need a gym without any people at all, that would motivate me
Reposted by Marcel Bollmann
joachimbaumann.bsky.social
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses. 
For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.
Reposted by Marcel Bollmann
gracekind.net
Never ask a man his age, a woman her salary, or GPT-5 whether a seahorse emoji exists
Reposted by Marcel Bollmann
markriedl.bsky.social
OpenAI is discovering what every social media company has also discovered: content moderation is hard and AI content moderation is also hard.
marcel.bollmann.me
Why does every social media feed eventually end up looking like:

[outrageous thing happening in the US]
[extremely polarizing AI take]
[random semi-funny meme]
[shocking thing happened to person I don't know]
[yet another reason climate change is worse than we thought]

It's so emotionally tiring.
Reposted by Marcel Bollmann
davehowcroft.com
Idk who needs to hear this, but you do *not* need to glaze the reviewers of your papers when you respond to their feedback.

Be grateful, sure, but don't wax poetic about how insightful and magical their farts are.

It's just professional correspondence, my guy.
Reposted by Marcel Bollmann
jurafsky.bsky.social
Now that school is starting for lots of folks, it's time for a new release of Speech and Language Processing! Jim and I added all sorts of material for the August 2025 release! With slides to match! Check it out here: web.stanford.edu/~jurafsky/sl...
Speech and Language Processing
web.stanford.edu
marcel.bollmann.me
German is my native language :) Interestingly, I can't think of any way I've ever heard anyone refer to this button. I'm mostly used to constructions that avoid naming it altogether, such as "hast du gedrückt?" or "ist schon gedrückt" ("did you push?/it's already pushed")
marcel.bollmann.me
It sounds so ridiculous to me and I think no sane person would ever say that in conversation, yet the public transport companies use it on signs and in spoken announcements as if it was the most normal thing to say.
marcel.bollmann.me
German is infamous for its compound words, but one of my personal favorites that amazes me every time is "Haltewunschtaste", the button that you push on a bus or tram to indicate you want to get off, a.k.a. the "stopping wish button".
Reposted by Marcel Bollmann
togelius.bsky.social
New blog post: AI Allergy.

On my increasing disgust with the AI discourse, even though I still like the technical and philosophical. And how I wish I could be excited about AI again.

togelius.blogspot.com/2025/08/ai-a...
AI Allergy
I remember being excited about AI. I remember 20 years ago, being excited about neuroevolutionary methods for learning adaptive behaviors in...
togelius.blogspot.com
Reposted by Marcel Bollmann
mdlhx.bsky.social
"Hang on a moment while we sign you out." - sign me in, outlook, IN!
marcel.bollmann.me
Writing on any kind of social network always feels like approaching strangers in the Wild West – they always react like you might pull a gun on them any second. I'm so tired of it.
marcel.bollmann.me
Making changes—meaningless or not—with the sole intent of gaming the peer-review system sounds like a better way of defining misconduct here. But the hard problem in practice is of course proving the author’s intent.
marcel.bollmann.me
My point is that “changes to a paper not made with the intention to improve it” is not a good criterion for misconduct IMO, as we make changes for other reasons all the time (including what Emile said about appeasing reviewers).
marcel.bollmann.me
DeepL when translating to Swedish: Have you considered writing in Japanese?
A screenshot from the DeepL translator, showing a Swedish translation with “Alternatives” underneath it, which contains entirely Japanese text.
marcel.bollmann.me
We make meaningless changes all the time, e.g. to fit within page limits; those also often don't make the paper better in the eyes of the authors...
Reposted by Marcel Bollmann
iaugenstein.bsky.social
🎓 Looking for PhD opportunities in #NLProc for a start in Spring 2026?

🗒️ Add your expression of interest to join @copenlu.bsky.social here by 20 July: forms.office.com/e/HZSmgR9nXB

Selected candidates will be invited to submit a DARA fellowship application with me: daracademy.dk/fellowship/f...
Microsoft Forms
forms.office.com
Reposted by Marcel Bollmann
tomerullman.bsky.social
a man throws a glass apple at a concrete egg, causing it to shatter