Willem Röpke
@willemropke.bsky.social
1.3K followers 410 following 59 posts
PhD student | Interested in all things decision-making and learning
Pinned
willemropke.bsky.social
Exciting news! My paper on multi-objective reinforcement learning was accepted at AAMAS 2025!

We introduce IPRO (Iterated Pareto Referent Optimisation)—a principled approach to solving multi-objective problems.

🔗 Paper: arxiv.org/abs/2402.07182
💻 Code: github.com/wilrop/ipro
willemropke.bsky.social
I think the Qwen team is missing out on a huge opportunity to basically be the default model in all NeurIPS submissions by not releasing Qwen3
willemropke.bsky.social
Using LLMs to come up with prompts for LLMs to then ask the LLMs to then train the LLMs to then ....
willemropke.bsky.social
RIP to my investments from the past few years, it was nice seeing the green while it lasted
willemropke.bsky.social
The people demand Qwen3!
willemropke.bsky.social
I've been bashing my head against a wall trying to make TRL and their new vllm-serve work and holy moly it's just an infinite pain

why must i suffer
willemropke.bsky.social
Why does reading a book feel so much more satisfying than watching a TV show? Both are ways of consuming content so I don't get the difference
willemropke.bsky.social
Bought a Cherry Coke by accident today.

Horrible things happening everywhere apparently
willemropke.bsky.social
This is actually insanely clever, I would've never thought about this. Seems very interesting and important to fix!
zoeshao.bsky.social
What happens if we tokenize cat as [ca, t] rather than [cat]?

LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself.

#AI #LLMs #tokenization #alignment
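
To make the [ca, t] versus [cat] idea concrete, here is a minimal sketch that lists every way a word can be split into pieces from a vocabulary. The vocabulary below is a made-up toy, and `segmentations` is a hypothetical helper, not a real tokenizer and not the method from the quoted work. It only shows that one string can map to several token sequences.

```python
from typing import List, Set


def segmentations(word: str, vocab: Set[str]) -> List[List[str]]:
    """Return every way to split `word` into pieces that all appear in `vocab`."""
    if not word:
        return [[]]  # one way to split the empty string: no pieces
    results = []
    for end in range(1, len(word) + 1):
        piece = word[:end]
        if piece in vocab:
            # Keep this piece and split the remainder recursively.
            for rest in segmentations(word[end:], vocab):
                results.append([piece] + rest)
    return results


# Toy vocabulary (an assumption for illustration only).
TOY_VOCAB = {"c", "a", "t", "ca", "at", "cat"}

for split in segmentations("cat", TOY_VOCAB):
    print(split)
```

Every split spells the same text, so a filter that only looks at the decoded string cannot tell them apart.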
willemropke.bsky.social
I don't recall seeing a video in the recent past that depressed me as much as what I just watched unfolding in the Oval Office
willemropke.bsky.social
5/ This was a collaborative effort across multiple universities that began over a year ago. A huge thanks to my co-authors for seeing it through with me, and to everyone who shared valuable insights along the way.

If you're interested in our work, I'd love to hear from you!
willemropke.bsky.social
4/ Beyond RL, IPRO has applications in other domains like multi-objective path planning, which the codebase now supports! If you work on decision-making under trade-offs, this might be relevant to you.
willemropke.bsky.social
3/ By building on oracles with theoretical guarantees, we carry those guarantees over to the multi-objective problem. At the same time, we can adapt strong RL algorithms such as DQN, A2C, and PPO, making IPRO both practical and theoretically sound.
willemropke.bsky.social
2/ IPRO decomposes the multi-objective problem into a sequence of single-objective problems. By solving each step efficiently, it systematically explores the search space while keeping track of what remains.
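
As a rough illustration of the decomposition described in 2/, the toy sketch below finds the Pareto front of a small finite set of value vectors by repeatedly solving a single-objective problem relative to a reference point. This is not the IPRO implementation: the brute-force `oracle`, the scoring rule, and the way new reference points are generated are all assumptions chosen to keep the example short, and exhaustive search stands in for an RL algorithm.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


def oracle(candidates: List[Point], referent: Point) -> Optional[Point]:
    """Single-objective step: among candidates strictly better than `referent`
    in every objective, return the one with the largest worst-case improvement
    (ties broken by the sum of objectives). Returns None if there is none."""
    feasible = [v for v in candidates if all(a > b for a, b in zip(v, referent))]
    if not feasible:
        return None
    return max(
        feasible,
        key=lambda v: (min(a - b for a, b in zip(v, referent)), sum(v)),
    )


def pareto_front(candidates: List[Point]) -> List[Point]:
    """Collect Pareto-optimal points by iterating the oracle over reference points."""
    dims = len(candidates[0])
    # Start below every candidate so the first call sees the whole set.
    start = tuple(min(v[i] for v in candidates) - 1 for i in range(dims))
    to_explore = [start]
    found = set()
    while to_explore:
        referent = to_explore.pop()
        point = oracle(candidates, referent)
        if point is None:
            continue  # nothing left in this region
        found.add(point)
        # Remaining regions: raise one coordinate of the referent at a time.
        for i in range(dims):
            to_explore.append(referent[:i] + (point[i],) + referent[i + 1:])
    return sorted(found)


values = [(1, 5), (3, 4), (2, 2), (4, 1), (3, 3)]
print(pareto_front(values))  # [(1, 5), (3, 4), (4, 1)]
```

Each oracle call is an ordinary single-objective maximisation, and the list of reference points is the bookkeeping of which regions still need to be searched.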
willemropke.bsky.social
1/ In many real-world problems, agents must balance multiple conflicting objectives—think of self-driving cars optimising speed vs. safety or AI assistants trading off response quality vs. efficiency.

How can we design efficient RL algorithms for such settings?
willemropke.bsky.social
How can I stop ChatGPT from talking to me with emojis? This is just the worst update I've ever experienced.

I've put it in its memory, in my details, and I even repeat it in the chat but it's just replying like 👉🥺👈
willemropke.bsky.social
Macron is the goat

French people don't appreciate true genius
willemropke.bsky.social
Why did OpenAI update ChatGPT to use emojis in its responses? I hate it, and even when I explicitly say so it just keeps doing it.
willemropke.bsky.social
To whoever put my email on some spam list: I fart in your general direction
willemropke.bsky.social
The fact that in the year 2025 we are still dealing with the stupid "make the paper fit in an arbitrary format for the camera ready submission" minigame is killing me.

Either let me group authors or let me put acknowledgements after the main text. This isn't hard.
willemropke.bsky.social
Does anyone have any good hacks for making the AAMAS template not suck for people with multiple affiliations? I lose a gazillion lines for basically no reason...