Orion Weller
@orionweller.bsky.social
PhD Student at Johns Hopkins University. Previously: Allen Institute for AI, Apple, Samaya AI. Research for #NLProc #IR
Ever wonder how test-time compute would do in retrieval? 🤔
introducing ✨rank1✨
rank1 is distilled from R1 & designed for reranking.
rank1 is state-of-the-art at complex reranking tasks in reasoning, instruction-following, and general semantics (often 2x RankLlama 🤯)
🧵
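Roughly, the pointwise recipe: at test time the reranker reasons about a (query, passage) pair, then emits a relevance judgment. A minimal sketch of that idea below; the model ID, prompt format, and last-token scoring are placeholders, not rank1's actual interface (see the paper/release for the real one).

```python
# Sketch of test-time-compute reranking: generate a reasoning chain about
# the (query, passage) pair, then score relevance from a judgment token.
# MODEL_ID and the prompt are hypothetical, not rank1's real interface.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/reasoning-reranker"  # placeholder model ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

def relevance(query: str, passage: str) -> float:
    prompt = (f"Query: {query}\nPassage: {passage}\n"
              "Think step by step, then answer 'true' or 'false': relevant?\n")
    inputs = tokenizer(prompt, return_tensors="pt")
    # Spend test-time compute: let the model reason before judging.
    out = model.generate(**inputs, max_new_tokens=256,
                         output_scores=True, return_dict_in_generate=True)
    # Simplification: assume the judgment token is generated last,
    # and rank by the probability mass on "true".
    true_id = tokenizer.encode("true", add_special_tokens=False)[0]
    return torch.softmax(out.scores[-1][0], dim=-1)[true_id].item()

docs = ["passage A ...", "passage B ..."]
ranked = sorted(docs, key=lambda d: relevance("my query", d), reverse=True)
```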
February 26, 2025 at 2:57 PM
Reposted by Orion Weller
We use this collection of tasks to propose multiple benchmarks for multilingual, code, European and Indic languages, and many more.
We find that smaller multilingual models (~500M) outperform notably larger 7B models, likely because the larger models had limited multilingual pre-training.
February 20, 2025 at 9:57 AM
Check out our new encoder model, ModernBERT! 🤖
Super grateful to have been part of such an awesome team effort and very excited about the gains for retrieval/RAG! 🚀
I'll get straight to the point.
We trained 2 new models. Like BERT, but modern. ModernBERT.
Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.
It's much faster, more accurate, longer context, and more useful. 🧵
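For tire-kicking: it should load like any BERT-style encoder through transformers (assuming a release recent enough to include the architecture; model ID per the public Hub release). A quick mean-pooling embedding sketch; the base model isn't retrieval-tuned, so this shows the plumbing, not final quality.

```python
# ModernBERT as a drop-in BERT-style encoder via transformers.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "answerdotai/ModernBERT-base"  # public Hub ID
tok = AutoTokenizer.from_pretrained(MODEL_ID)
enc = AutoModel.from_pretrained(MODEL_ID)

def embed(texts):
    """Mean-pooled sentence embeddings -- the usual retrieval plumbing.
    (Base model is not retrieval-tuned; fine-tune before relying on it.)"""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state           # (B, T, D)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # zero out padding
    return (hidden * mask).sum(1) / mask.sum(1)

print(embed(["ModernBERT is a workhorse encoder."]).shape)
```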
December 19, 2024 at 9:28 PM
MASC is such a fun time! If your university is in the mid-Atlantic, please consider hosting!
📢 Want to host MASC 2025?
The 12th Mid-Atlantic Student Colloquium is a one-day event bringing together students, faculty, and researchers from universities and industry in the Mid-Atlantic.
Please submit this very short form if you are interested in hosting! Deadline January 6th. #MASC2025
December 16, 2024 at 9:25 PM
Reposted by Orion Weller
I'm looking for an intern to introduce Sparse Embedding models to Sentence Transformers! If you're passionate about open source, interested in helping practitioners use your tools, and enjoy embedders/retrievers/rerankers, then I'd love to hear from you!
Links with details and to apply in 🧵
November 27, 2024 at 2:31 PM
Reposted by Orion Weller
I noticed a lot of starter packs skewed towards faculty/industry, so I made one of just NLP & ML students: go.bsky.app/vju2ux
Students do different research, go on the job market, and recruit other students. Ping me and I'll add you!
November 23, 2024 at 7:54 PM
Creating a 🦋 starter pack for people working in IR/RAG: go.bsky.app/88ULgwY
I can’t seem to find everyone though, help definitely appreciated to fill this out (DM or comment)!
November 23, 2024 at 9:19 PM
Using LLMs for query or document expansion in retrieval (e.g., HyDE and Doc2Query) has scores going 📈
But do these approaches work for all IR models and for different types of distribution shifts? Turns out it's actually more 📉 🚨
📝 (arxiv soon): orionweller.github.io/assets/pdf/L...
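For context, the HyDE idea in a few lines: have an LLM write a hypothetical answer document, embed that instead of the raw query, and retrieve by similarity. A sketch only; the embedder and the `generate` callable are stand-ins for whatever you use.

```python
# HyDE in a nutshell: embed an LLM-written hypothetical answer instead of
# the raw query, then rank documents by cosine similarity to it.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any dense encoder

def hyde_rank(query, docs, generate):
    # `generate` is your LLM call, e.g. a chat-completions wrapper.
    hypothetical = generate(f"Write a short passage answering: {query}")
    q = embedder.encode([hypothetical])[0]
    d = embedder.encode(docs)
    sims = d @ q / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(-sims)]
```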
November 18, 2024 at 10:30 AM