Zilei Shao
@zoeshao.bsky.social
First-year Ph.D. student @ StarAI Lab, UCLA | Harvey Mudd College ‘24
zoeshao.bsky.social
Check out our website, advtok.github.io, for all the details! We have already released the paper, the code (wrapped in a package), and a blog post.

We’d love to hear your thoughts!
Adversarial Tokenization
advtok.github.io
zoeshao.bsky.social
What happens if we tokenize cat as [ca, t] rather than [cat]?

LLMs are trained on just one tokenization per word, but they still understand alternative tokenizations. We show that this can be exploited to bypass safety filters without changing the text itself.

#AI #LLMs #tokenization #alignment
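
To make the [ca, t] versus [cat] example concrete, here is a minimal sketch that lists every way a string can be split into tokens. The vocabulary `TOY_VOCAB` and the helper `tokenizations` are hypothetical names made up for illustration; this is not the advtok package's API, and a real LLM vocabulary is far larger.

```python
def tokenizations(text, vocab):
    """Return every way to split `text` into pieces that all belong to `vocab`."""
    if not text:
        return [[]]  # one way to tokenize the empty string: no tokens
    results = []
    for end in range(1, len(text) + 1):
        piece = text[:end]
        if piece in vocab:
            # Keep this prefix as a token, then tokenize the remainder.
            for rest in tokenizations(text[end:], vocab):
                results.append([piece] + rest)
    return results


# Toy vocabulary (an assumption for illustration, not a real tokenizer's).
TOY_VOCAB = {"c", "a", "t", "ca", "at", "cat"}

for tokens in tokenizations("cat", TOY_VOCAB):
    # Every split joins back to the same text; only the token sequence differs.
    print(tokens, "->", "".join(tokens))
```

Every sequence it prints decodes to the same string, "cat", which is the sense in which the text itself is unchanged while the tokens the model sees are different.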