Julius Cheng
@juliuscheng.bsky.social
Finishing up PhD in NLP at University of Cambridge. Deciding whether to put my weirdo ML thoughts on here or just be normal
juliuscheng.bsky.social
They say it's something like 20-30% but 0% of my papers get accepted!! Something definitely wrong here
Reposted by Julius Cheng
zouharvi.bsky.social
The paper comes with mysterious fig 1. 👁️
juliuscheng.bsky.social
Our experiments are on machine translation, but this method works with any generator + reranker setup!

Eager to hear your thoughts, and happy reranking!
juliuscheng.bsky.social
Bonus: we show how multi-fidelity Bayesian optimization can use a smaller, faster proxy scoring model to search even more efficiently. We get the best performance by training a distilled model from our main CometKiwi model.
juliuscheng.bsky.social
The candidate pool is actually a search space, and you can model your uncertainty about the scores of candidates you haven't scored yet with GP regression. Use BayesOpt to search the pool for promising candidates.

This nearly gets the maximum achievable score with only 70/200 scoring calls!
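
For readers who want to see the shape of the idea, here is a minimal pure-Python sketch of searching a candidate pool with GP regression under a scoring budget. It is not the paper's implementation. The per-candidate feature vectors, the RBF kernel, the upper-confidence-bound acquisition rule, and every function and parameter name (`bayesopt_rerank`, `score_fn`, `beta`, and so on) are assumptions made for illustration; `score_fn` stands in for an expensive scorer such as a quality-estimation model.

```python
import math
import random


def rbf(a, b, length_scale=1.0):
    """RBF kernel between two feature vectors (kernel choice is an assumption)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * length_scale ** 2))


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            # Clamp the pivot to guard against tiny negative round-off.
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(X_obs, y_obs, X_new, noise=1e-4):
    """Posterior (mean, std) at each point of X_new given scored points."""
    mean_y = sum(y_obs) / len(y_obs)  # constant prior mean
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X_obs)] for i, a in enumerate(X_obs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [y - mean_y for y in y_obs]))
    out = []
    for x in X_new:
        k = [rbf(x, a) for a in X_obs]
        mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
        v = solve_lower(L, k)
        var = max(rbf(x, x) - sum(vi * vi for vi in v), 0.0)
        out.append((mu, math.sqrt(var)))
    return out


def bayesopt_rerank(features, score_fn, budget, n_init=5, beta=2.0, seed=0):
    """Score at most `budget` candidates; return (best index, scored dict)."""
    rng = random.Random(seed)
    n = len(features)
    budget = min(budget, n)
    # Start from a few randomly chosen candidates.
    scored = {i: score_fn(i) for i in rng.sample(range(n), min(n_init, budget))}
    while len(scored) < budget:
        obs = list(scored)
        rest = [i for i in range(n) if i not in scored]
        post = gp_posterior([features[i] for i in obs],
                            [scored[i] for i in obs],
                            [features[i] for i in rest])
        # UCB acquisition (an assumption): favor high mean or high uncertainty.
        nxt = max(zip(rest, post), key=lambda t: t[1][0] + beta * t[1][1])[0]
        scored[nxt] = score_fn(nxt)
    best = max(scored, key=scored.get)
    return best, scored
```

To try it, pass one feature vector per candidate and a function mapping a candidate index to its score, for example `bayesopt_rerank(features, score_fn, budget=70)` on a pool of 200. How well this does depends on the features making similar candidates score similarly, which the sketch simply assumes.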
juliuscheng.bsky.social
Reranking is expensive and we show that you don't need to score every candidate in the candidate pool.

Use Bayesian optimization with GPs!
juliuscheng.bsky.social
Language models for MT are good at generating large candidate pools that contain good translations; they're less good at assigning the highest score to the best translation.

This is where reranking comes in: rescoring with COMET, noisy channel decoding, minimum Bayes risk, etc.
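
As a concrete example of one of these, here is a small sketch of minimum Bayes risk reranking: pick the candidate that agrees most, on average, with the rest of the pool. The token-overlap F1 utility is only a stand-in assumption (real setups typically use a learned metric), and the names `overlap_f1` and `mbr_rerank` are hypothetical.

```python
from collections import Counter


def overlap_f1(hyp, ref):
    """Token-overlap F1 between two strings (a toy stand-in utility)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    common = sum((h & r).values())
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_rerank(candidates, utility=overlap_f1):
    """Return the index of the candidate with the highest average utility
    against all other candidates, which serve as pseudo-references."""
    if len(candidates) < 2:
        return 0

    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    return max(range(len(candidates)), key=expected_utility)
```

Note that this needs a utility call for every pair of candidates, so the cost grows quadratically with the pool size.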
juliuscheng.bsky.social
Happy to announce that our work "A Bayesian Optimization Approach to Machine Translation" was accepted to NAACL 2025!

Special thanks to @ufal-cuni.bsky.social for organizing MT Marathon 2025 where I was able to team up with @maikezufle.bsky.social and @zouharvi.bsky.social!

Explainer below: