Peter Vamplew
@amp1874.bsky.social
210 followers 220 following 220 posts
Professor in IT @ Federation Uni. Multi-objective reinforcement learning. Human-aligned AI. Best known for the f*cking mailing list paper. Jambo & Bengals fan. https://t.co/UNoOrbGApz
amp1874.bsky.social
Dear Benjamin,

Congrats on being the fastest scammy conference organiser of all time. Inviting me to a conference unrelated to the topic of my paper is highly questionable, but you did it within a day of publication so at least you’re fast.

Kindly remove me from your mailing list.

Regards,
Peter
amp1874.bsky.social
It examines the components of effective human apologies, and analyses how and how well these have been implemented in prior apologetic AI systems. Haddie has done a fantastic job here - this is the most thorough and in-depth student publication that I've been fortunate enough to be involved in.
amp1874.bsky.social
Computer says sorry?

After months in copy-editing hell, Haddie Harland's review of AI apology research is now available: link.springer.com/article/10.1...

This is a must-read for anyone interested in how AI systems can effectively and appropriately use apologies to facilitate human interaction. 1/2
AI apology: a critical review of apology in AI systems - Artificial Intelligence Review
Apologies are a powerful tool used in human-human interactions to provide affective support, regulate social processes, and exchange information following a trust violation. The emerging field of AI apology investigates the use of apologies by artificially intelligent systems, with recent research suggesting how this tool may provide similar value in human-machine interactions. Until recently, contributions to this area were sparse, and these works have yet to be synthesised into a cohesive body of knowledge. This article provides the first synthesis and critical analysis of the state of AI apology research, focusing on studies published between 2020 and 2023. We derive a framework of attributes to describe five core elements of apology: outcome, interaction, offence, recipient, and offender. With this framework as the basis for our critique, we show how apologies can be used to recover from misalignment in human-AI interactions, and examine trends and inconsistencies within the field. Among the observations, we outline the importance of curating a human-aligned and cross-disciplinary perspective in this research, with consideration for improved system capabilities and long-term outcomes.
link.springer.com
amp1874.bsky.social
Have it also make the initial suggestion of a dish, otherwise there's a risk you might accidentally choose something which you do have the ingredients for.
amp1874.bsky.social
Decisions, decisions. Which should I read first?
Reposted by Peter Vamplew
cslg-bot.bsky.social
Dsouza, Ofosu, Amaogu, Pigeon, Boudreault, Maghoul, Moreno-Cruz, Leonenko: BoreaRL: A Multi-Objective Reinforcement Learning Environment for Climate-Adaptive Boreal Forest Management https://arxiv.org/abs/2509.19846 https://arxiv.org/pdf/2509.19846 https://arxiv.org/html/2509.19846
Reposted by Peter Vamplew
arxiv-cs-cl.bsky.social
Lingxiao Kong, Cong Yang, Oya Deniz Beyan, Zeyd Boukhers
Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective
https://arxiv.org/abs/2509.21613
Reposted by Peter Vamplew
tmiller-uq.bsky.social
The deadline for my postdoc on scalable clinical decision support closes in 1 week: 4 October (Australian Eastern Standard Time). Please share with anyone who you think would be interested.
tmiller-uq.bsky.social
I'm hiring again! Please share. I'm recruiting a postdoc research fellow in human-centred AI for scalable decision support. Join us to investigate how to balance scalability and human control in medical decision support. Closing date: 4 October (AEST).
uqtmiller.github.io/recruitment/
Recruitment
uqtmiller.github.io
amp1874.bsky.social
@tresvillain.bsky.social Andrew, for consistency you need to change your name to Trendvillian :-)
amp1874.bsky.social
Dear authors who I shall not name. Thank you for citing my work. But I'm not sure that a paper dating from 1995 should be cited in the context of a paragraph which begins "Recent trends show...."
amp1874.bsky.social
It might be skewed by the topics I search, but I find that almost every response I get contains at least one statement which is clearly wrong. So I just skip past them these days and go to the search results. It's not just Google; I've found Copilot Pro to be just as bad.
amp1874.bsky.social
Why would you love them? My experience is that they are blatantly incorrect about 90% of the time.
Reposted by Peter Vamplew
upolehsan.bsky.social
⚠️ The #CHI2026 paper I submitted? It almost didn't exist. That's the BTS part academics never post. So I will…to normalize what I call unglamorous persistence.

This summer was one of my hardest, mentally. 🌥️ Between ...
1/n
amp1874.bsky.social
Hey! There's finally someone else in Australia doing research in multi-objective reinforcement learning. @marcusgal.bsky.social arxiv.org/pdf/2509.14816
arxiv.org
amp1874.bsky.social
Interesting. We've noticed that changes in greedy policy can cause interference in the vector values learned by a multi-objective RL agent which hurts learning, but hadn't measured how frequently that happens. This paper suggests it might be a bigger issue than we thought.
Reposted by Peter Vamplew
amp1874.bsky.social
[42] Liu, H., et al.: Global and local structure preserving network for 3D human pose estimation. IEEE Trans. Image Process. 30, 1158–1171 (2021)
amp1874.bsky.social
[41] Ren, Z., et al.: Structure-aware generation network for anatomically plausible brain image synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9440–9449 (2019)
amp1874.bsky.social
[38] Sohl-Dickstein, J., et al.: The energy diffusion model: training energy-based models in diffusion time. arXiv preprint arXiv:2006.11239 (2020)
Note: Not only is there no paper with this title, the arXiv link given is for the previous paper in the reference list [37].
amp1874.bsky.social
[3] Karras, T., Laine, S., Aila, T.: StyleGAN2: improved styleGAN for realistic image synthesis. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
amp1874.bsky.social
Given my doubts about this reference, I have checked the other references in this chapter using the same approach. While the majority of the references are genuine, my investigations indicate that the following references also do not correspond to actual publications:
amp1874.bsky.social
I can only conclude that this paper does not exist, and was either invented by Fahim and Maji, or (more likely) indicates that generative AI was used inappropriately in the production of this chapter.
amp1874.bsky.social
Searching Google Scholar, the only match is the citation in Fahim and Maji’s chapter. Similarly doing a Google search for that exact title in quotes, the only matches are to Fahim and Maji’s chapter.