Andrew Drozdov
@mrdrozdov.com
5.2K followers 610 following 180 posts
Research Scientist @ Mosaic x Databricks. Adaptive Methods for Retrieval, Generation, NLP, AI, LLMs https://mrdrozdov.github.io/
Pinned
mrdrozdov.com
Using 100+ tokens to answer 2 + 3 =
Reposted by Andrew Drozdov
markriedl.bsky.social
The transformer was invented at Google. RLHF was not invented in industry labs, but came to prominence at OpenAI and DeepMind. I took 5 of the most influential papers (black dots) and visualized their references. Blue dots are papers that acknowledge federal funding (DARPA, NSF).
Reposted by Andrew Drozdov
flrp.bsky.social
LongEval is turning three this year!

This is a Call for Participation for our CLEF 2025 Lab - see how your IR system does in the long term.

Check the details on our page:
clef-longeval.github.io
LongEval 2025
mrdrozdov.com
The PhD is pretraining. Interview prep is alignment. Take this to heart. :)
Reposted by Andrew Drozdov
markar.bsky.social
We have updated #nocha, a leaderboard for reasoning over long-context narratives 📖, with some new models including #Gemini 2.5 Pro which shows massive improvements over the previous version! Congrats to #Gemini team 🪄 🧙 Check 🔗 novelchallenge.github.io for details :)
Leaderboard showing performance of language models on claim verification task over book-length input. o1-preview is the best model with 67.36% accuracy followed by Gemini 2.5 Pro with 64.17% accuracy.
mrdrozdov.com
I think ARR used to do this? Seems like it’s missing in the recent cycle(s).

stats.aclrollingreview.org/iterations/2...
ARR Dashboard
mrdrozdov.com
A corollary here is that a relevant context might not improve the probability of the right answer.
mrdrozdov.com
Perhaps the most misunderstood aspect of retrieval: For a context to be relevant, it is not enough for it to improve the probability of the right answer.
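A minimal sketch of the quantity these two posts are talking about (my own illustration, not from the thread): score the gold answer's log-probability with and without a retrieved context using a small causal LM. The point of the posts is that this delta on its own does not certify relevance, and a genuinely relevant context may not move it.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any small causal LM works for the sketch
tok = AutoTokenizer.from_pretrained(model_name)
lm = AutoModelForCausalLM.from_pretrained(model_name)
lm.eval()

def answer_logprob(prompt: str, answer: str) -> float:
    """Sum of token log-probs of `answer` given `prompt`.

    Assumes prompt + answer tokenize into the prompt's tokens followed by the
    answer's tokens (true here for GPT-2 with a leading space on the answer).
    """
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(full_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    n_prompt = prompt_ids.shape[1]
    # keep only the positions that predict answer tokens
    return token_lp[0, n_prompt - 1:].sum().item()

query = "Q: Who wrote 'Born to Run'?\nA:"
context = "Born to Run is a 1975 song by Bruce Springsteen.\n"
answer = " Bruce Springsteen"

delta = answer_logprob(context + query, answer) - answer_logprob(query, answer)
print(f"log p(answer) improvement from context: {delta:.3f}")
```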
Reposted by Andrew Drozdov
danliden.com
MLflow is on BlueSky! Follow @mlflow.org to keep up to date on new releases, blogs and tutorials, events, and more.
mrdrozdov.com
---Born To Add, Sesame Street
---(sung to the tune of Bruce Springsteen’s Born to Run)
mrdrozdov.com
One, and two, and three police persons spring out of the shadows
Down the corner comes one more
And we scream into that city night: “three plus one makes four!”
Well, they seem to think we’re disturbing the peace
But we won’t let them make us sad
’Cause kids like you and me baby, we were born to add
mrdrozdov.com
"How Claude Code is using a 50-Year-Old trick to revolutionize programming"
mrdrozdov.com
Somehow my most controversial take of 2025 is that agents relying on grep are a form of RAG.
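The shape of the claim is easy to show in code. A rough sketch (mine, not any particular agent's implementation; `call_llm` is a hypothetical stand-in for whatever model the agent actually invokes): grep is the retrieval step, the matches are the augmented context, and the model call is the generation step.

```python
import subprocess

def grep_retrieve(pattern: str, repo_dir: str, max_hits: int = 20) -> list[str]:
    # Retrieval step: lexical search over the codebase with plain grep.
    proc = subprocess.run(["grep", "-rn", pattern, repo_dir],
                          capture_output=True, text=True)
    return proc.stdout.splitlines()[:max_hits]

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for whatever model the agent actually calls.
    raise NotImplementedError

def answer_with_grep_rag(question: str, pattern: str, repo_dir: str) -> str:
    hits = grep_retrieve(pattern, repo_dir)    # retrieve
    context = "\n".join(hits)                  # augment the prompt
    prompt = f"Repo context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return call_llm(prompt)                    # generate
```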
mrdrozdov.com
Search is the key to building trustworthy AI and will only be more important as we build more ambitious applications. With that in mind, there's not nearly enough energy spent improving the quality of search systems.

Follow the link for the full episode:
www.linkedin.com/posts/data-b...
Data Brew by Databricks on LinkedIn: Join us on the latest Data Brew episode for a deep dive on Retrieval…
Join us on the latest Data Brew episode for a deep dive on Retrieval, rerankers, and RAG tips and tricks with our very own Andrew Drozdov, Research Scientist…
mrdrozdov.com
It was a real pleasure talking about effective IR approaches with Brooke and Denny on the Data Brew podcast.

Among other things, I'm excited about embedding finetuning and reranking as modular ways to improve RAG pipelines. Everyone should use these more!
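As a concrete example of the "modular" part, a reranker can be dropped between an existing retriever and the generator without touching either. A small sketch (my own, not from the episode; the cross-encoder checkpoint is just a common public one, not a recommendation from the podcast):

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, candidates: list[str], top_k: int = 5) -> list[str]:
    # Score each (query, document) pair and reorder the first-pass results.
    scores = reranker.predict([(query, doc) for doc in candidates])
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:top_k]]

# Usage: slot between the existing retriever and the generator.
# top_docs = rerank(user_query, first_pass_docs)
```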
mrdrozdov.com
We're probably a little too obsessed with zero-shot retrieval. If you have documents (you do), then you can generate synthetic data and finetune your embedding. Blog post led by @jacobianneuro.bsky.social shows how well this works in practice.

www.databricks.com/blog/improvi...
Improving Retrieval and RAG with Embedding Model Finetuning
Fine-tune embedding models on Databricks to enhance retrieval and RAG accuracy with synthetic data—no manual labeling required.
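For the flavor of the recipe, here is a compressed sketch of my own under simple assumptions, not the blog's exact pipeline (`generate_query` stands in for whatever LLM writes the synthetic queries): generate a query per document chunk, then finetune the embedder on the resulting pairs with an in-batch-negatives loss.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

def generate_query(document: str) -> str:
    # Placeholder: in practice, prompt an LLM to write a question the chunk answers.
    return "What is this passage about? " + document[:50]

documents = ["chunk one of your corpus ...", "chunk two of your corpus ..."]
pairs = [InputExample(texts=[generate_query(d), d]) for d in documents]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any base embedder
loader = DataLoader(pairs, shuffle=True, batch_size=32)
loss = losses.MultipleNegativesRankingLoss(model)  # in-batch negatives

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)
model.save("finetuned-embedder")
```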
mrdrozdov.com
I do want to see aggregate stats about the model’s generation, but total reasoning tokens is perhaps the least informative one.
mrdrozdov.com
"All you need to build a strong reasoning model is the right data mix."

The pipeline that creates the data mix:
mrdrozdov.com
After frequent road runs during a Finland visit I tend to feel the same
mrdrozdov.com
Using 100+ tokens to answer 2 + 3 =
mrdrozdov.com
It’s pretty obvious we’re in a local minimum for pretraining. Would expect more breakthroughs in the 5-10 year range. Granted, it’s still incredibly hard and expensive to do good research in this space, despite the number of labs working on it.
kyunghyuncho.bsky.social
"the gap between OAI/Anthropic/Meta/etc. and a large group of companies all over the world you've never cared to know of, in terms of LM pre-training? tiny" - 💡 me (Nov 2, 2024)
Reposted by Andrew Drozdov
susiedent.com
Word of the day (of course) is ‘scurryfunging’, from US dialect: the frantic attempt to tidy the house just before guests arrive.
Reposted by Andrew Drozdov
kyunghyuncho.bsky.social
... didn't know this would be one of the hottest takes i've had ...

for more on my thoughts, see drive.google.com/file/d/1sk_t...