Jakub Łucki
@jakublucki.bsky.social
24 followers
38 following
11 posts
Visiting Researcher at NASA JPL | Data Science MSc at ETH Zurich
Pinned
Jakub Łucki
@jakublucki.bsky.social
· Dec 10
An Adversarial Perspective on Machine Unlearning for AI Safety
Large language models are finetuned to refuse questions about hazardous knowledge, but these protections can often be bypassed. Unlearning methods aim at completely removing hazardous capabilities fro...
arxiv.org