Desmond Elliott
@delliott.bsky.social
Reposted by Desmond Elliott
tokshop.bsky.social
Three invited speakers will share their insights at TokShop! Hear from Yuval Pinter @uvp.bsky.social, Desmond Elliott @delliott.bsky.social, and Adrian Łańcucki on cutting-edge tokenization research. Don't miss these keynote presentations! #ICML2025 tokenization-workshop.github.io/speakers
Reposted by Desmond Elliott
iyadrahwan.bsky.social
It is with great pleasure that I share MAXMINDS 2.0, a new Max Planck program to support scholars in danger of displacement by war or natural disasters, and who have limited access to resources and institutional support.

If you know affected scholars, please share.

www.maxminds.mpg.de
delliott.bsky.social
📢I am hiring a Postdoc to work on post-training methods for low-resource languages. Apply by August 15 employment.ku.dk/faculty/?sho....
Let's talk at #ACL2025NLP in Vienna if you want to know more about the position and life in Denmark.
Reposted by Desmond Elliott
valentinapy.bsky.social
💡Beyond math/code, instruction following with verifiable constraints is suitable to be learned with RLVR.
But the set of constraints and verifier functions is limited and most models overfit on IFEval.
We introduce IFBench to measure model generalization to unseen constraints.
delliott.bsky.social
📣 I am happy to support Ph.D applications to the Danish Advanced Research Academy. My main areas of research include multimodal learning and tokenization-free language processing. Feel free to reach out if you have similar interests! Applications due August 29 www.daracademy.dk/fellowship/f...
Reposted by Desmond Elliott
iccv.bsky.social
Following #CVPR2025, #ICCV2025 implemented a new policy targeting accountability and integrity. PCs identified 25 highly irresponsible reviewers, resulting in the desk rejection of 29 associated papers, including 12 submissions that otherwise would have been accepted.
delliott.bsky.social
The participants brought a lot of energy, enthusiasm, and great posters to highlight their research: @antoniakrm.bsky.social and @saravera.bsky.social pictured.

Finally, I want to thank the Danish Data Science Academy, Carlsberg Foundation, and the Villum Foundation for supporting the event!
[Images: Sara presenting her poster on reasoning with DeepSeek-R1; Antonia presenting her poster (not visible in the image)]
delliott.bsky.social
Huge thanks to everyone who attended the Copenhagen NLP Symposium last week. Thanks to our wonderful speakers @kylelo.bsky.social, @najoung.bsky.social, Yohei Oseki, @mziizm.bsky.social, and @loubnabnl.hf.co! @mariaa.bsky.social did a great job of summarizing the talks in these liveposts (quoted).
[Image: People finding their seats before the event started]
delliott.bsky.social
No, we didn’t record anything but there was an excellent live-poster!
Reposted by Desmond Elliott
scfrank.bsky.social
📯 Best Paper Award at the CVPR workshop on Visual Concepts for our (@doneata.bsky.social + @delliott.bsky.social) paper on probing vision/lang/vision+lang models for semantic norms!

TLDR: SSL vision models (swinV2, dinoV2) are surprisingly similar to LLM & VLMs even w/o lang 👀
arxiv.org/abs/2506.03994
delliott.bsky.social
Your workshop is so popular that someone is managing the door on a one-in, one-out basis.
delliott.bsky.social
I am looking forward to meeting people working on multimodality at #CVPR2025. You can find me hopping between the @vlms4all.bsky.social and Visual Concepts Workshops on Thursday. Feel free to reach out if you want to grab a coffee ☕ or a beer 🍻 during the week!
[Image: Where and when to find me at #CVPR2025 this week]
Reposted by Desmond Elliott
ilkerkesen.bsky.social
Announcing our recent work “Multilingual Pretraining for Pixel Language Models”! We introduce PIXEL-M4, a pixel language model pretrained on four visually & linguistically diverse scripts: English, Hindi, Ukrainian & Simplified Chinese. #NLProc
Reposted by Desmond Elliott
srishtiy.bsky.social
I am excited to announce our latest work 🎉 "Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory". We review recent works on culture in VLMs and argue for deeper grounding in cultural theory to enable more inclusive evaluations.

Paper 🔗: arxiv.org/pdf/2505.22793
Reposted by Desmond Elliott
annarogers.bsky.social
📢 The Copenhagen NLP Symposium on June 20th!

- Invited talks by @loubnabnl.hf.co (HF) @mziizm.bsky.social (Cohere) @najoung.bsky.social (BU) @kylelo.bsky.social (AI2) Yohei Oseki (UTokyo)
- Exciting posters by other participants

Register to attend and/or present your poster at cphnlp.github.io /1
Reposted by Desmond Elliott
mdlhx.bsky.social
Interested in multilingual tokenization in #NLP? Lisa Beinborn and I are hiring!

PhD candidate position in Göttingen, Germany: www.uni-goettingen.de/de/644546.ht...

PostDoc position in Leuven, Belgium:
www.kuleuven.be/personeel/jo...

Deadline 6th of June
Reposted by Desmond Elliott
mariaa.bsky.social
Has anyone written anything about *scraping and text processing* for internet pretraining data? Practical details, which tools are used, which webpage elements are considered, how HTML to text conversion is done?

(I know about work on quality filters, relevant but not quite what I'm looking for)
delliott.bsky.social
Thanks for sharing! I'm looking forward to reading this because I enjoyed reading your lecture notes on Natural Language Understanding with Distributed Representation back in the day.
Reposted by Desmond Elliott
lampinen.bsky.social
Had fun talking at the Spurious Correlations & Shortcut Learning workshop at ICLR! One example I brought up, which I think provides an uncommon perspective: a case where spurious shortcuts can improve generalization... even to out-of-distribution sets where the spurious feature doesn't generalize! Thread:
delliott.bsky.social
What would you do if someone has rolled your dataset into their benchmark (cool!) but marked it as being available under a much more permissive license (not so cool)?
delliott.bsky.social
I'm recruiting a postdoc on an 18-month contract candidate.hr-manager.net/ApplicationI.... The position is about deploying LLMs in the Danish public sector. This is an interdisciplinary project that touches on technical, ethical, and legal aspects of LLM usage. Apply by 1 May 2025.
Reposted by Desmond Elliott
mziizm.bsky.social
Very excited to release Kaleidoscope—a multilingual, multimodal evaluation set for VLMs, built as part of our open-science initiative!

🌍 18 languages (high-, mid-, low-)
📚 21k questions (55% require image understanding)
🧪 STEM, social science, reasoning, and practical skills
Reposted by Desmond Elliott
cohereforai.bsky.social
🚀 We are excited to introduce Kaleidoscope, the largest culturally-authentic exam benchmark.

📌 Most VLM benchmarks are English-centric or rely on translations, missing linguistic & cultural nuance. Kaleidoscope expands in-language multilingual 🌎 & multimodal 👀 VLM evaluation