@ltgoslo.bsky.social
55 followers 31 following 18 posts
The Language Technology Group (LTG) at the University of Oslo, Norway, does research on a range of topics in Natural Language Processing (NLP), including language modeling for Norwegian and other languages.
4. #BabyLM challenge description paper, co-authored by Lucas Georges Gabriel Charpentier

babylm.github.io
3. "EdinHelsOW WMT 2025 CreoleMT System Description: Improving Lusophone Creole Translation through Data Augmentation, Model Merging and LLM Post-editing" by Jacqueline Rowe, Ona de Gibert, Mateusz Klimaszewski, Coleman Haley, Alexandra Birch and Yves Scherrer
(proc. of WMT)
www2.statmt.org/wmt25/
2. "Improved Norwegian Bokmål Translations for FLORES" by Petter Mæhlum, Anders Næss Evensen and Yves Scherrer
(in proceedings of the WMT 2025 workshop)
www2.statmt.org/wmt25/
The #EMNLP2025 conference is starting in two weeks in Suzhou, China. @emnlpmeeting.bsky.social @ltgoslo.bsky.social

The Oslo Language Technology Group will be there with at least four papers, see the thread🧵:
We're hiring! A postdoc-level researcher position in NLP, focusing on generative approaches to event extraction, is open at the University of Oslo. The contract is for 30 months. Closing date 11 Aug. Come join us! www.jobbnorge.no/en/available...
Researcher in Natural Language Processing (283057) | University of Oslo
Job title: Researcher in Natural Language Processing (283057), Employer: University of Oslo, Deadline: Monday, August 11, 2025
1. "An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT)". LTG co-authors: Nikolay Arefyev, Mariia Fedorova, Andrey Kutuzov, Petter Mæhlum, Vladislav Mikhailov, Stephan Oepen, David Samuel and many others from hplt-project.org
arxiv.org/abs/2503.10267 (main ACL)
HPLT - High Performance Language Technologies
A space that combines petabytes of natural language data with large-scale model training
hplt-project.org
LTG – the Oslo Language Technology Group – will be presenting five papers at the #ACL2025NLP conference of @aclmeeting.bsky.social this summer in #Vienna, see paper descriptions below 🧵
📄 Multi-label Scandinavian Language Identification (SLIDE), by Fedorova et al.

📄 Interactive maps for corpus-based dialectology, by Scherrer et al.
📄 NorEventGen: generative event extraction from Norwegian news, by You et al.

📄 Mixed Feelings: Cross-Domain Sentiment Classification of Patient Feedback, by Rønningstad et al.
📄 Large Language Models for Small Languages: A Study of Continual Pretraining on Languages of Norway, by Samuel et al.

📄 Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles, by Touileb et al.
📄 A Collection of Question Answering Datasets for Norwegian, by Mikhailov et al.

📄 The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective, by de la Rosa et al.
Over the coming days, LTG will be presenting 8 fresh papers at the NoDaLiDa/Baltic-HLT conference in Tallinn 🔥

Several of these represent collaborations with colleagues from UiB, NTNU, and the National Library of Norway. 🤝

Come see us if you're at #NoDaLiDa

See list of papers in the 🧵 below: