In short: your retrieval doesn't need to be so expensive!
🧵
Binary retrieval with int8 rescoring costs just ~6GB of RAM and ~45GB of disk space for embeddings.
🧵
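For intuition, here's a back-of-the-envelope sketch of those numbers. The corpus size and embedding dimension are my assumptions, chosen to land near the figures above:

```python
# Rough storage math; num_docs and dim are assumptions for illustration
num_docs = 41_000_000  # assumed corpus size
dim = 1024             # assumed embedding dimension

binary_bytes = num_docs * dim // 8  # 1 bit per dimension, held in RAM
int8_bytes = num_docs * dim         # 1 byte per dimension, held on disk

print(f"binary index: ~{binary_bytes / 1e9:.1f} GB of RAM")    # ~5.2 GB
print(f"int8 embeddings: ~{int8_bytes / 1e9:.1f} GB of disk")  # ~42 GB
```
🧵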
Two components are needed:
- A binary index: I used an IndexBinaryFlat for exact search and an IndexBinaryIVF for approximate search
- An int8 "view", i.e. a way to efficiently load the int8 embeddings from disk given a document ID
🧵
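A minimal sketch of those two components, assuming faiss (home of IndexBinaryFlat and IndexBinaryIVF) and a numpy memmap as the int8 view; the file names and shapes are placeholders:

```python
import faiss
import numpy as np

dim = 1024  # assumed embedding dimension

# Component 1: a binary index held in RAM. IndexBinaryFlat does exact search;
# faiss.IndexBinaryIVF would be the approximate variant.
binary_embeddings = np.load("binary_embeddings.npy")  # (num_docs, dim // 8), dtype=uint8
binary_index = faiss.IndexBinaryFlat(dim)  # dim is in bits; each row is dim/8 bytes
binary_index.add(binary_embeddings)

# Component 2: an int8 "view": a memory-map over the on-disk int8 embeddings,
# so individual rows can be fetched by document ID without loading the whole file
num_docs = binary_embeddings.shape[0]
int8_view = np.memmap("int8_embeddings.bin", dtype=np.int8, mode="r", shape=(num_docs, dim))
```
🧵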
1. Embed your query using a dense embedding model into a 'standard' fp32 embedding
2. Quantize the fp32 embedding to binary: 32x smaller
3. Use an approximate (or exact) binary index to retrieve e.g. 40 documents (~20x faster than an fp32 index)
🧵
4. Load the int8 embeddings of those 40 documents from disk via the int8 view
5. Rescore the top 40 documents using the fp32 query embedding and the 40 int8 embeddings
6. Sort the 40 documents based on the new scores, grab the top 10
7. Load the titles/texts of the top 10 documents
🧵
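Putting the steps together, a rough end-to-end sketch. The model name and file names are placeholders, the memmap view is my assumption, and `quantize_embeddings` comes from Sentence Transformers:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")  # placeholder model
dim = model.get_sentence_embedding_dimension()

binary_index = faiss.read_index_binary("wikipedia.index")  # the binary index from before
num_docs = binary_index.ntotal
int8_view = np.memmap("int8_embeddings.bin", dtype=np.int8, mode="r", shape=(num_docs, dim))

# 1. Embed the query into a 'standard' fp32 embedding
query_embedding = model.encode("How do binary embeddings work?")

# 2. Quantize the fp32 query embedding to (packed) binary: 32x smaller
binary_query = quantize_embeddings(query_embedding.reshape(1, -1), precision="ubinary")

# 3. Retrieve e.g. 40 candidate documents from the binary index
_, candidate_ids = binary_index.search(binary_query, 40)
candidate_ids = candidate_ids[0]

# 4. Load the int8 embeddings of those candidates via the on-disk view
candidate_int8 = int8_view[candidate_ids].astype(np.float32)

# 5. Rescore: dot product between the fp32 query and the int8 candidates
scores = candidate_int8 @ query_embedding

# 6. Sort by the new scores and keep the top 10
top10_ids = candidate_ids[np.argsort(-scores)[:10]]

# 7. Load the titles/texts of the top 10 documents from wherever they're stored
```
🧵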
And don't forget to ⭐ the project if you haven't yet.
🧵
Now that Python 3.9 has lost security support, Sentence Transformers no longer supports it.
🧵
This release works with both Transformers v4 and the upcoming v5. In the future, Sentence Transformers will only work with Transformers v5, but not yet!
Even my tests run on both Transformers v4 and v5.
🧵
When mining for hard negatives to create a strong training dataset, you can now pass `output_scores=True` to get similarity scores returned. This can be useful for some distillation losses!
🧵
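Roughly how that looks; the dataset and base model here are placeholders:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

# Placeholders: any (query, positive) dataset and embedding model will do
dataset = load_dataset("sentence-transformers/natural-questions", split="train")
model = SentenceTransformer("all-MiniLM-L6-v2")

# output_scores=True also returns the similarity scores, usable as soft
# labels for distillation losses
dataset = mine_hard_negatives(dataset, model, num_negatives=5, output_scores=True)
```
🧵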
You can now use community translations of the tiny NanoBEIR retrieval benchmark instead of only the English one, by passing `dataset_id`, e.g. `dataset_id="lightonai/NanoBEIR-de"` for the German benchmark.
🧵
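For example, with a placeholder model:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import NanoBEIREvaluator

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

# Evaluate on the German community translation rather than the English original
evaluator = NanoBEIREvaluator(dataset_id="lightonai/NanoBEIR-de")
results = evaluator(model)
```
🧵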
Similar to SentenceTransformer and SparseEncoder, you can now use multi-processing with CrossEncoder rerankers. Useful for multi-GPU and CPU settings, and simple to configure:
just `device=["cuda:0", "cuda:1"]` or `device=["cpu"]*4` on the `predict`/`rank` calls.
🧵
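For example (the reranker model is a placeholder):

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # placeholder reranker

pairs = [
    ("What is the capital of France?", "Paris is the capital of France."),
    ("What is the capital of France?", "Berlin is the capital of Germany."),
]

# Spread inference across two GPUs; device=["cpu"] * 4 would use 4 CPU workers
scores = model.predict(pairs, device=["cuda:0", "cuda:1"])
```
🧵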
I'm very excited about the future of the project, and about the world of embeddings and retrieval at large!
🧵
Sentence Transformers will remain a community-driven, open-source project, with the same Apache 2.0 license as before. Contributions from researchers, developers, and enthusiasts are welcome and encouraged!
🧵