Yinglun Zhu
@yinglunzhu.bsky.social
Assistant Prof @ UC Riverside. Research on Efficient ML, RL, and LLMs. CS PhD @ UW Madison.

yinglunz.com
TTM can also be extended to datasets without local groups -- by treating the entire dataset as a global assignment problem between all images and captions (solved in polynomial time).

The global TTM variant achieves up to 33.3% relative error reduction.
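A minimal sketch of how that global variant could look (illustrative only, not the paper's code; the embeddings and the n-by-n similarity matrix are assumed): score every image against every caption, then solve one dataset-wide assignment with the Hungarian method in scipy.

```python
# Hypothetical sketch: treat the whole test set as one assignment problem.
import numpy as np
from scipy.optimize import linear_sum_assignment

def global_match(image_embs: np.ndarray, text_embs: np.ndarray) -> np.ndarray:
    """image_embs: (n, d); text_embs: (n, d). Returns the caption index assigned to each image."""
    sim = image_embs @ text_embs.T               # (n, n) image-caption similarity matrix
    rows, cols = linear_sum_assignment(-sim)     # maximize total similarity; polynomial time
    pred = np.empty_like(cols)
    pred[rows] = cols                            # caption matched to image i
    return pred
```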
October 31, 2025 at 6:03 PM
TTM isn’t limited to benchmarks with k-by-k groups.

For 1-by-k groups, GroupMatch = GroupScore, so the metric change alone brings no benefit. Yet TTM still delivers substantial improvements -- up to 85.7% -- on datasets such as SugarCrepe and WhatsUp.
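To see why the two metrics coincide here (a small illustration under my reading of them): with a single image and k candidate captions, any within-group matching just picks the top-scoring caption.

```python
import numpy as np

def one_by_k_correct(sims: np.ndarray, correct_idx: int = 0) -> bool:
    """sims[j]: similarity of the group's single image to caption j (correct caption at index 0 by convention)."""
    return int(np.argmax(sims)) == correct_idx   # matching a 1-by-k group reduces to argmax
```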
October 31, 2025 at 6:03 PM
TTM provides substantial improvements on top of SimpleMatch, without external supervision.

Remarkably, TTM enables SigLIP-B16 (~0.2B params) to surpass GPT-4.1 on MMVP-VLM.

Shout out to the awesome authors behind SigLIP! @giffmana.ai @xzhai.bsky.social @kolesnikov.ch and Basil Mustafa
October 31, 2025 at 6:03 PM
To push further, we develop Test-Time Matching (TTM), an iterative, self-improving algorithm with two key components:

(i) GroupMatch-based pseudo-labels for stronger supervision.
(ii) A progressively decaying selection threshold schedule to gradually expand coverage across the test set.
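Roughly, the loop these two components describe could look like this (an illustrative sketch; match_group's margin, the threshold schedule, and the callables are placeholders, not the paper's API):

```python
# Hypothetical TTM-style loop: matching-based pseudo-labels + a decaying threshold.
from itertools import permutations
from typing import Callable, List, Tuple
import numpy as np

def match_group(sim: np.ndarray) -> Tuple[Tuple[int, ...], float]:
    """sim: (k, k) image-caption similarities in one k-by-k group (k >= 2).
    Returns the best caption-per-image assignment and its margin over the runner-up."""
    ranked = sorted(
        ((float(sum(sim[i, p[i]] for i in range(len(p)))), p)
         for p in permutations(range(sim.shape[1]))),
        reverse=True,
    )
    (best_score, best_assign), (second_score, _) = ranked[0], ranked[1]
    return best_assign, best_score - second_score

def test_time_matching(
    score_groups: Callable[[], List[np.ndarray]],                      # re-scores every group with the current model
    finetune_on: Callable[[List[Tuple[int, Tuple[int, ...]]]], None],  # updates the model on (group_id, assignment) pseudo-labels
    thresholds: Tuple[float, ...] = (0.9, 0.6, 0.3, 0.1),
) -> None:
    for tau in thresholds:                         # (ii) progressively decaying selection threshold
        pseudo = []
        for gid, sim in enumerate(score_groups()):
            assignment, margin = match_group(sim)  # (i) GroupMatch-style pseudo-label
            if margin >= tau:                      # keep only confident groups this round
                pseudo.append((gid, assignment))
        finetune_on(pseudo)                        # self-improve, then re-match next round
```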
October 31, 2025 at 6:03 PM
SimpleMatch reveals substantial hidden capability -- it enables SigLIP-B16 to surpass all prior results, and GPT-4.1 to achieve the first result that exceeds human performance on Winoground.
October 31, 2025 at 6:03 PM
Super excited to share Test-Time Matching (TTM), an iterative, self-improving algorithm that unlocks substantial compositional reasoning capabilities in multimodal models.

TTM enables SigLIP-B16 (~0.2B params) to outperform GPT-4.1 on MMVP-VLM, establishing a new SOTA.
October 31, 2025 at 6:03 PM
🚀Excited to share our new paper:

Online Finetuning Decision Transformers with Pure RL Gradients

RL drives reasoning in LLMs, but it remains underexplored for online finetuning of Decision Transformers (DTs), where most methods still rely mainly on supervised objectives.

Why?
October 14, 2025 at 7:04 PM
Sharing new paper: Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data

We extend classical unimodal active learning to the multimodal setting with unaligned data, enabling data-efficient finetuning and pretraining of vision-language models such as CLIP and SigLIP.
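For context, the classical pool-based loop being extended looks roughly like this (generic uncertainty sampling, shown only as background and not the paper's algorithm; in the multimodal, unaligned setting the expensive "label" would be an image-caption pairing rather than a class label):

```python
# Background sketch of classical pool-based active learning (not the paper's method).
import numpy as np

def select_queries(uncertainty: np.ndarray, labeled_mask: np.ndarray, budget: int) -> np.ndarray:
    """Pick the `budget` most uncertain unlabeled pool items to send for annotation."""
    scores = np.where(labeled_mask, -np.inf, uncertainty)
    return np.argsort(-scores)[:budget]
```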

1/3
October 10, 2025 at 6:03 PM
Most methods allocate compute uniformly, ignoring variation in query difficulty.

We propose adaptive algorithms that estimate query difficulty on the fly and allocate compute strategically—just enough for easy queries and more for hard ones.
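One concrete way to picture this (a hedged sketch, not the paper's exact algorithm): sample answers one at a time and stop as soon as they agree, so easy queries use a few samples while hard ones consume the full budget.

```python
# Illustrative adaptive allocation via early stopping on answer agreement.
# `generate` stands in for one sampled model answer per call (placeholder).
from collections import Counter
from typing import Callable

def adaptive_majority(generate: Callable[[], str], min_samples: int = 4,
                      max_samples: int = 64, agree_frac: float = 0.75) -> str:
    votes: Counter = Counter()
    for n in range(1, max_samples + 1):
        votes[generate()] += 1                          # spend one more unit of compute
        answer, count = votes.most_common(1)[0]
        if n >= min_samples and count / n >= agree_frac:
            return answer                               # easy query: consensus reached early
    return votes.most_common(1)[0][0]                   # hard query: used the full budget
```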

📊 Example (avg. budget = 32):

(2/3)
July 1, 2025 at 6:45 PM
🚀Excited to share our new paper: Strategic Scaling of Test-Time Compute: A Bandit Learning Approach.

We turn test-time compute allocation into a bandit learning problem, achieving:
✅ +11.10% on MATH-500
✅ +7.41% on LiveCodeBench

Paper: arxiv.org/pdf/2506.12721

(1/3)
July 1, 2025 at 6:45 PM
I’m recruiting multiple PhD students for Fall 2025 at UCR! If you’re interested in working on efficient ML, RL, and LLMs, please apply to the UCR CS/EE PhD program.

Please visit yinglunz.com for detailed information on research directions and contact instructions.
December 3, 2024 at 7:52 PM