Yuda Song
@yus167.bsky.social
1.3K followers
190 following
12 posts
PhD at Machine Learning Department, Carnegie Mellon University | Interactive Decision Making | https://yudasong.github.io
Yuda Song
@yus167.bsky.social
· Dec 9
The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
Learning from human preference data has emerged as the dominant paradigm for fine-tuning large language models (LLMs). The two most common families of techniques -- online reinforcement learning (RL) ...
arxiv.org
Yuda Song
@yus167.bsky.social
· Dec 6
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Self-improvement is a mechanism in Large Language Model (LLM) pre-training, post-training and test-time inference. We explore a framework where the model verifies its own outputs, filters or reweights...
arxiv.org