Lightnews — Scholar-powered news

Joan Velja

@joanvelja.bsky.social

13 followers 150 following 6 posts

Container of multitudes | MS AI @UvA Amsterdam | Prev @LondonSafeAI

joanvelja.vercel.app/about

Reposted by Joan Velja

Karim Abdel Sadek

@karimabdel.bsky.social

I will be at @neuripsconf.bsky.social this week!

Would love to chat about Multi-agent systems, RL, Human-AI Alignment, or anything interesting :)

I'm also applying for PhD programs this cycle, feel free to reach out for any advice!

More about me: karim-abdel.github.io

December 8, 2024 at 11:59 PM

Joan Velja

@joanvelja.bsky.social

I’ll be at #NeurIPS for the first time :)

I’ll be presenting arxiv.org/abs/2410.03768 at SoLaR!

I’d be happy to chat all things Alignment and Safety, test-time compute and RL.

I’m also looking for PhD programs starting Fall 2025 👀, lmk if you would like to talk it over a coffee!

Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs

The rapid proliferation of frontier model agents promises significant societal advances but also raises concerns about systemic risks arising from unsafe interactions. Collusion to the disadvantage of...

arxiv.org

December 9, 2024 at 12:13 AM

Joan Velja

@joanvelja.bsky.social

Is this the most consequential finding of modern AI to date?
(Sentiment Neuron paper, openai.com/index/unsupe...)

November 19, 2024 at 4:15 PM

Joan Velja

@joanvelja.bsky.social

As a lefty, the only nitpick I have so far about Bluesky is the constant pop up menu coming from inadvertently stopping on someone’s pfp…

Research vibes are immaculate so far tho

November 19, 2024 at 2:10 PM
