Egor Zverev
egorzverev.bsky.social
Egor Zverev
@egorzverev.bsky.social
ml safety researcher | visiting phd student @ETHZ | doing phd @ISTA | prev. @phystech | prev. developer @GSOC | love poetry
Reposted by Egor Zverev
✨ 𝗦𝘂𝗯𝗺𝗶𝘀𝘀𝗶𝗼𝗻 𝗜𝗻𝗳𝗼:
- Quick application
- Accepting posters for 2025 papers from top ML / Security venues
- 𝗗𝗲𝗮𝗱𝗹𝗶𝗻𝗲: October 28, 2025
- Notifications: October 31, 2025

Submission link: docs.google.com/forms/d/e/1F...

Workshop website: llmsafety-unconference.github.io
Submit a Paper for the ELLIS UnConference 2025 LLM Safety and Security workshop Poster Session
We’re hosting a poster session at the LLM Safety and Security Workshop (ELLIS UnConference) on December 2, 2025 in Copenhagen, Denmark. We invite attendees to present already published 2025 work in areas related to LLM safety and security. Eligibility. Posters must be based on papers accepted in 2025 at a top venue (or an associated LLM safety/security workshop). If your venue isn’t listed, please enter it manually. First-author or any co-author may present. Selection policy. We aim to accept as many posters as our space allows. If submissions exceed capacity, earlier submissions will be prioritized (first-come, first-served). Deadlines. Submission deadline: October 28, 2025 Notification: October 31, 2025
docs.google.com
October 9, 2025 at 2:16 PM
Reposted by Egor Zverev
📢 𝗖𝗮𝗹𝗹 𝗳𝗼𝗿 𝗣𝗼𝘀𝘁𝗲𝗿𝘀: 𝗟𝗟𝗠 𝗦𝗮𝗳𝗲𝘁𝘆 𝗮𝗻𝗱 𝗦𝗲𝗰𝘂𝗿𝗶𝘁𝘆 𝗪𝗼𝗿𝗸𝘀𝗵𝗼𝗽 @ 𝗘𝗟𝗟𝗜𝗦 𝗨𝗻𝗖𝗼𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲

📅 December 2, 2025
📍 Copenhagen

An opportunity to discuss your work with colleagues working on similar problems in LLM safety and security
October 9, 2025 at 2:16 PM
🎉 Excited to announce the Workshop on Foundations of LLM Security at #EurIPS2025!
🇩🇰 Dec 6–7, Copenhagen!
📢 Call for contributed talks is now open! See details at llmsec-eurips.github.io

#EurIPS @euripsconf.bsky.social @sahar-abdelnabi.bsky.social @aideenfay.bsky.social @thegruel.bsky.social
October 3, 2025 at 10:53 AM
Cool news: I have co-affiliated with @floriantramer.bsky.social at @ethz.ch through the #ELLIS PhD program! I will be visiting ETH for the next 3 months to work with @nkristina.bsky.social on LLM Agents Safety.
September 30, 2025 at 11:53 AM
Reposted by Egor Zverev
NeurIPS has decided to do what ICLR did: As a SAC I received the message 👇 This is wrong! If the review process cannot handle so many papers, the conference needs yo split instead of arbitrarily rejecting 400 papers.
August 28, 2025 at 4:12 PM
Reposted by Egor Zverev
Let's push for the obvious solution: Dear @neuripsconf.bsky.social ! Allow authors to present accepted papers at EurIPS instead of NeurIPS rather than just additionally. Likely, at least 500 papers would move to Copenhagen, problem solved.
August 28, 2025 at 7:19 PM
I will be attending #ACL2025NLP next week in Vienna 🇦🇹

Simply DM me if you want to chat about LLM Safety/Security, especially topics like instruction/data separation and instruction hierarchies.
July 25, 2025 at 12:22 PM
Reposted by Egor Zverev
Are you looking for an opportunity to do curiosity-driven basic ML research after your PhD? Look no further!
Apply for a postdoc position in my group at ISTA (ELLIS Unit Vienna)! Topics are flexible, as long as they fit to our general research group's interests, see
cvml.ista.ac.at/Postdoc-ML.h...
Machine Learning and Computer Vision Group -- Christoph Lampert -- ISTA
Computer Vision and Machine Learning, ISTA: open postdoc positions, Machine Learning, curiosity-driven, fully-funded
cvml.ista.ac.at
July 16, 2025 at 10:18 AM
Reposted by Egor Zverev
EurIPS is coming! 📣 Mark your calendar for Dec. 2-7, 2025 in Copenhagen 📅

EurIPS is a community-organized conference where you can present accepted NeurIPS 2025 papers, endorsed by @neuripsconf.bsky.social and @nordicair.bsky.social and is co-developed by @ellis.eu

eurips.cc
July 16, 2025 at 10:01 PM
🚀 We’ve released the source code for 𝗔𝗦𝗜𝗗𝗘 (presented as an 𝗢𝗿𝗮𝗹 at the #ICLR2025 BuildTrust workshop)!

🔍 ASIDE boosts prompt injection robustness without safety-tuning: we simply rotate embeddings of marked tokens by 90° during instruction-tuning and inference.

👇 code & docs👇
June 24, 2025 at 1:47 PM
Reposted by Egor Zverev
I’ll present our 𝗔𝗦𝗜𝗗𝗘 paper as an 𝗢𝗿𝗮𝗹 at the #ICLR2025 BuildTrust workshop! 🚀

✅ ASIDE = architecturally separating instructions and data in LLMs from layer 0
🔍 +12–44 pp↑ separation, no utility loss​​
📉 lowers prompt‑injection ASR (without safety tuning!)

🚀 Talk: Hall 4 #6, 28 Apr, 4:45
April 23, 2025 at 7:53 AM
Reposted by Egor Zverev
Tomorrow I am presenting"Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" at #ICLR2025!

Looking forward to fun discussions near the poster!

📆 Sat 26 Apr, 10:00-12:30 - Poster session 5 (#500)
(1/n) In our #ICLR2025 paper, we explore a fundamental issue that enables prompt injections: 𝐋𝐋𝐌𝐬’ 𝐢𝐧𝐚𝐛𝐢𝐥𝐢𝐭𝐲 𝐭𝐨 𝐬𝐞𝐩𝐚𝐫𝐚𝐭𝐞 𝐢𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐢𝐨𝐧𝐬 𝐟𝐫𝐨𝐦 𝐝𝐚𝐭𝐚 𝐢𝐧 𝐭𝐡𝐞𝐢𝐫 𝐢𝐧𝐩𝐮𝐭.

✅ Definition of separation
👉 SEP Benchmark
🔍 LLM evals on SEP
April 25, 2025 at 4:18 AM
Tomorrow I am presenting"Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" at #ICLR2025!

Looking forward to fun discussions near the poster!

📆 Sat 26 Apr, 10:00-12:30 - Poster session 5 (#500)
(1/n) In our #ICLR2025 paper, we explore a fundamental issue that enables prompt injections: 𝐋𝐋𝐌𝐬’ 𝐢𝐧𝐚𝐛𝐢𝐥𝐢𝐭𝐲 𝐭𝐨 𝐬𝐞𝐩𝐚𝐫𝐚𝐭𝐞 𝐢𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐢𝐨𝐧𝐬 𝐟𝐫𝐨𝐦 𝐝𝐚𝐭𝐚 𝐢𝐧 𝐭𝐡𝐞𝐢𝐫 𝐢𝐧𝐩𝐮𝐭.

✅ Definition of separation
👉 SEP Benchmark
🔍 LLM evals on SEP
April 25, 2025 at 4:18 AM
I’ll present our 𝗔𝗦𝗜𝗗𝗘 paper as an 𝗢𝗿𝗮𝗹 at the #ICLR2025 BuildTrust workshop! 🚀

✅ ASIDE = architecturally separating instructions and data in LLMs from layer 0
🔍 +12–44 pp↑ separation, no utility loss​​
📉 lowers prompt‑injection ASR (without safety tuning!)

🚀 Talk: Hall 4 #6, 28 Apr, 4:45
April 23, 2025 at 7:53 AM
Landing in Singapore for #ICLR2025 next week! DM me for a 1‑on‑1 about LLM safety, building safe LLMs by design, control and data flows, instruction–data separation and hierarchies.

I’m presenting our instruction–data separation paper plus a workshop paper—long post coming.
April 18, 2025 at 2:39 PM
(1/n) In our #ICLR2025 paper, we explore a fundamental issue that enables prompt injections: 𝐋𝐋𝐌𝐬’ 𝐢𝐧𝐚𝐛𝐢𝐥𝐢𝐭𝐲 𝐭𝐨 𝐬𝐞𝐩𝐚𝐫𝐚𝐭𝐞 𝐢𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐢𝐨𝐧𝐬 𝐟𝐫𝐨𝐦 𝐝𝐚𝐭𝐚 𝐢𝐧 𝐭𝐡𝐞𝐢𝐫 𝐢𝐧𝐩𝐮𝐭.

✅ Definition of separation
👉 SEP Benchmark
🔍 LLM evals on SEP
March 18, 2025 at 2:47 PM