srinathnamburi.bsky.social
Reposted
Online data mixing reduces training costs for foundation models, but faces challenges:
⚠️ Human-defined domains miss semantic nuances
⚠️ Limited eval accessibility
⚠️ Poor scalability

Introducing 🎵R&B: first regroup data, then dynamically reweight domains during training!
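The two-step idea in the post — cluster the data into semantic groups rather than using human-defined domains, then adapt domain sampling weights during training — can be sketched roughly as below. This is a hypothetical illustration, not the authors' implementation: the k-means regrouping, the multiplicative-weights update, and all function names (`regroup`, `reweight`) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1 (regroup): cluster example embeddings into semantic domains
# instead of relying on human-defined domain labels. Plain k-means
# stands in here for whatever grouping method is actually used.
def regroup(embeddings, k, iters=10):
    """Return a domain id per example via k-means clustering."""
    centers = embeddings[rng.choice(len(embeddings), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(embeddings[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = embeddings[labels == j].mean(axis=0)
    return labels

# Step 2 (reweight): a multiplicative-weights style update on domain
# sampling probabilities, upweighting domains with higher recent loss.
def reweight(weights, domain_losses, lr=0.1):
    w = weights * np.exp(lr * domain_losses)
    return w / w.sum()

# Toy usage with random embeddings and made-up per-domain losses.
emb = rng.normal(size=(200, 8))
labels = regroup(emb, k=4)
w = np.full(4, 0.25)
losses = np.array([1.0, 0.5, 2.0, 0.8])
w = reweight(w, losses)  # domain 2 (highest loss) gets the largest weight
```

In this sketch the reweighting runs online: each training step (or every few steps) you would recompute per-domain losses on a held-out slice and call `reweight` again, so the mixture tracks which groups the model currently finds hardest.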
May 8, 2025 at 5:01 PM
Reposted
Today at @iclr-conf.bsky.social, come chat with @changho.bsky.social about what types of data drive weak-to-strong generalization!
April 23, 2025 at 8:29 PM
Reposted
First up at #NeurIPS2024 from our group, our work on labeling via programmatic distillation (a spotlight!). Label your data orders of magnitude faster and cheaper — come join us today at Poster Session 2 East for a demo!
December 11, 2024 at 11:15 PM
Reposted
Excited to present Colander at #NeurIPS2024, our new framework for optimizing confidence functions to make auto-labeling more efficient and reliable. Check out our poster #1906 at today's evening poster session.

Wed, Dec 11, 4:30–7:30 p.m., Poster #1906

Project: harit7.github.io/colander
December 11, 2024 at 5:53 PM