Marcin Junczys-Dowmunt (Marian NMT)
@marian-nmt.bsky.social
160 followers
130 following
16 posts
NLP. NMT. Main author of Marian NMT. Research Scientist at Microsoft Translator.
https://marian-nmt.github.io
Reposted by Marcin Junczys-Dowmunt (Marian NMT)
Yoav Goldberg
@yoavgo.bsky.social
· Dec 13
Frontier Models are Capable of In-context Scheming
Frontier models are increasingly trained and deployed as autonomous agents. One safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives - ...
arxiv.org