Sadly, we didn't see any gains from adding more reasoning tokens. Perhaps that's because reranking isn't as hard a task and doesn't need many tokens. Or maybe we just didn't use the right incantation 🤷‍♂️
Try it out yourself!
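(Not part of the thread, just a minimal sketch of what the "more reasoning tokens" check could look like: the model ID and prompt template below are assumptions, not the authors' exact setup — check the collection linked later in the thread for the real names.)

```python
# Hedged sketch: score one (query, doc) pair under growing reasoning-token budgets
# and see whether the relevance judgment changes. MODEL_ID and the prompt format
# are assumptions for illustration, not the authors' exact recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "jhu-clsp/rank1-7b"  # assumed name; verify against the HF collection

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "Decide whether the document answers the query.\n"
    "Query: what causes tides?\n"
    "Document: Tides are driven mainly by the Moon's gravitational pull.\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

for budget in (64, 128, 256, 512, 1024):  # illustrative reasoning-token budgets
    out = model.generate(**inputs, max_new_tokens=budget)
    text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    print(f"--- budget={budget} tokens ---")
    print(text[-200:])  # tail of the trace, where the relevance call lands
```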
- it's simple to train
- it doesn't require much data (and isn't overfit)
- it generalizes insanely well
- it just thinks differently than all other rerankers
There's so much you could do, we've just barely started!
We quantized each of our models: rank1-32b can fit on a 24GB GPU while maintaining nearly all of its performance 🚀 🚀
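(Rough sketch of fitting the 32B model on a single 24GB card, using 4-bit bitsandbytes quantization as a stand-in for whatever scheme was actually released; the model ID is an assumption, so check the Hugging Face collection in this thread for the real name and any pre-quantized variants.)

```python
# Hedged sketch: load a ~32B generative reranker in 4-bit so it fits on a 24GB GPU.
# The model ID is an assumption -- the released quantized variant may differ.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_ID = "jhu-clsp/rank1-32b"  # assumed name; verify on the HF collection

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights: ~32B params land well under 24GB
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

# Sanity-check that the quantized weights actually fit the card.
print(f"approx. weight footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
```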
Turns out, nearly all of them were relevant docs - other systems just missed them!
They're even SOTA at multilingual instruction following, despite using no multilingual IR data 🤯
Our data (600k) and models are open source, check them out: huggingface.co/collections/...
📝: arxiv.org/abs/2502.18418
Keep reading to see what surprised us 😮
We find that smaller multilingual models (~500M params) outperform notably larger 7B models, likely due to the larger models' limited multilingual pre-training.
I've even been looking at how it works for instruction-based retrieval, and it turns out that having modern data helps a lot 🔥
Excited to see what you do with it!
But thanks for the shoutout @mrdrozdov.com! Definitely still relevant.
There was a cool follow-up from Google as well (don't know if the authors are on 🦋): arxiv.org/pdf/2311.09175
Interestingly, these effects are weakest under long-query shift (e.g. paragraph+ sized queries, à la ArguAna).
It turns out there's a strong and consistent negative correlation between model performance and gains from using expansion. And it holds for all 20+ rankers we tested!
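(Sketch of the shape of that analysis only, with made-up placeholder numbers: pair each ranker's baseline score with its gain or loss from expansion, then check the rank correlation.)

```python
# Hedged sketch of the correlation described above. The numbers are placeholders,
# NOT results from the paper -- they only illustrate the computation.
from scipy.stats import spearmanr

baseline_ndcg = [0.28, 0.35, 0.42, 0.51, 0.63, 0.71]            # weaker -> stronger rankers
delta_with_expansion = [0.06, 0.04, 0.02, -0.01, -0.03, -0.05]  # change in score when expansion is added

rho, p_value = spearmanr(baseline_ndcg, delta_with_expansion)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
# A strongly negative rho is what "stronger rankers gain less (or even lose) from expansion" looks like.
```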