Lightnews — Scholar-powered news

@craigmacdonald.bsky.social

470 followers 19 following 11 posts


craigmacdonald.bsky.social

@craigmacdonald.bsky.social

6/7. Another cool addition by @macavaney.bsky.social: integrating PyTerrier extension docs into pyterrier.readthedocs.io, including pyterrier_dr (dense retrieval), pyterrier_doc2query, pyterrier_pisa etc. One place now has a full list of lots of SOTA extensions for retrieval.

December 19, 2024 at 1:14 PM

craigmacdonald.bsky.social

@craigmacdonald.bsky.social

5/7. @macavaney.bsky.social has been polishing pyterrier_dr (github.com/terrierteam/...) – our single-vector Dense Retrieval framework, for instance including PRF:

December 19, 2024 at 1:14 PM
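The post in 5/7 does not say which PRF variant pyterrier_dr uses, so the following is only a generic sketch of pseudo-relevance feedback over single-vector representations: the query vector is interpolated with the centroid of the top-k retrieved document vectors (a Rocchio-style update). The function name `vector_prf` and the parameters `k`, `alpha` and `beta` are assumptions made for illustration, not the pyterrier_dr API.

```python
from typing import List, Sequence


def vector_prf(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 3,
    alpha: float = 1.0,
    beta: float = 1.0,
) -> List[float]:
    """Hypothetical dense PRF: alpha * query + beta * centroid(top-k docs).

    doc_vecs is assumed to be ordered by first-pass rank, best first.
    """
    top = doc_vecs[:k]
    if not top:
        # No feedback documents: fall back to the original query vector.
        return list(query_vec)
    dim = len(query_vec)
    # Centroid of the top-k feedback document vectors.
    centroid = [sum(d[i] for d in top) / len(top) for i in range(dim)]
    return [alpha * q + beta * c for q, c in zip(query_vec, centroid)]


query = [1.0, 0.0]
ranked_docs = [[0.0, 1.0], [0.0, 3.0], [9.0, 9.0]]  # toy vectors, best first
expanded = vector_prf(query, ranked_docs, k=2, alpha=0.5, beta=0.5)
```

The expanded vector would then be used for a second retrieval pass in place of the original query vector.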

craigmacdonald.bsky.social

@craigmacdonald.bsky.social

4/7. A complete rework of pipeline compilation by @macavaney.bsky.social. PyTerrier compilation rewrites a pipeline by, e.g., applying rank cutoff earlier. So these two pipelines are equivalent, but the compiled one is potentially faster – .compile() allows that optimisation to happen automatically.

December 19, 2024 at 1:14 PM
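The kind of rewrite described in 4/7 can be illustrated with a toy pipeline representation. This is a plain-Python sketch, not PyTerrier's compiler: the `Retrieve` and `Cutoff` classes and the `compile_pipeline` function are hypothetical stand-ins. The single rule shown folds a rank cutoff into the retriever that precedes it, so the retriever only has to produce as many results as are kept.

```python
from dataclasses import dataclass, replace
from typing import List, Union


@dataclass(frozen=True)
class Retrieve:
    """Hypothetical retrieval stage returning up to num_results documents."""
    name: str
    num_results: int = 1000


@dataclass(frozen=True)
class Cutoff:
    """Hypothetical stage keeping only the top k results per query."""
    k: int


Stage = Union[Retrieve, Cutoff]


def compile_pipeline(stages: List[Stage]) -> List[Stage]:
    """Push each rank cutoff into an immediately preceding retriever."""
    out: List[Stage] = []
    for stage in stages:
        if isinstance(stage, Cutoff) and out and isinstance(out[-1], Retrieve):
            prev = out[-1]
            # Retrieving min(num_results, k) gives the same top-k output
            # as retrieving num_results and then cutting at k.
            out[-1] = replace(prev, num_results=min(prev.num_results, stage.k))
        else:
            out.append(stage)
    return out


original = [Retrieve("BM25"), Cutoff(10)]
compiled = compile_pipeline(original)
```

Here `compiled` holds a single `Retrieve` stage limited to 10 results, which returns the same top 10 as the original two-stage pipeline while asking the retriever for far fewer documents.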

craigmacdonald.bsky.social

@craigmacdonald.bsky.social

3/7. Precomputation of common pipeline prefixes is something really 😎. For this example experiment comparing BM25 with BM25 >> monoT5, this means that only one BM25 retrieval is needed to evaluate both pipelines. Great for speeding up comparative experiments on large query sets!

December 19, 2024 at 1:14 PM
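The prefix precomputation idea from 3/7 can be sketched in plain Python. This is an illustration of the technique only, not PyTerrier's implementation: pipelines are modelled as lists of functions, and `bm25`, `rerank`, `common_prefix_len` and `run_all` are hypothetical names. The shared leading stages are run once and their output is reused by every pipeline.

```python
from typing import Callable, Dict, List

Rows = List[Dict]
Stage = Callable[[Rows], Rows]

calls = {"bm25": 0, "rerank": 0}  # counts how often each stage runs


def bm25(rows: Rows) -> Rows:
    """Stand-in for an expensive first-stage retrieval."""
    calls["bm25"] += 1
    return [dict(r, docno="d1", score=1.0) for r in rows]


def rerank(rows: Rows) -> Rows:
    """Stand-in for a neural re-ranker such as monoT5."""
    calls["rerank"] += 1
    return [dict(r, score=r["score"] + 10.0) for r in rows]


def common_prefix_len(pipelines: List[List[Stage]]) -> int:
    """Number of leading stages shared by every pipeline."""
    n = 0
    for stages in zip(*pipelines):
        if any(s is not stages[0] for s in stages):
            break
        n += 1
    return n


def run_all(pipelines: List[List[Stage]], queries: Rows) -> List[Rows]:
    n = common_prefix_len(pipelines)
    shared = queries
    for stage in pipelines[0][:n]:  # run the shared prefix exactly once
        shared = stage(shared)
    outputs = []
    for pipe in pipelines:
        rows = shared
        for stage in pipe[n:]:  # run only the stages after the prefix
            rows = stage(rows)
        outputs.append(rows)
    return outputs


queries = [{"qid": "q1", "query": "terrier"}]
outputs = run_all([[bm25], [bm25, rerank]], queries)
```

After this runs, `calls["bm25"]` is 1 even though both pipelines start with BM25. The sketch relies on stages returning new rows rather than mutating their input, since the shared output is handed to every pipeline.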

craigmacdonald.bsky.social

@craigmacdonald.bsky.social

2/7. Making transform_iter() into a first-class citizen of pt.Transformer – no need to manipulate dataframes – you can now write transformers that operate on a list of dicts. This (backwards compatible) change is big news as PyTerrier has been dataframe based from the outset.

December 19, 2024 at 1:14 PM
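The list-of-dicts style mentioned in 2/7 can be sketched as below. To stay runnable without PyTerrier installed, this is a standalone class; in PyTerrier itself it would presumably subclass pt.Transformer and override transform_iter(). The `TitleBooster` class, its `bonus` parameter and the `title` field are made up for illustration.

```python
from typing import Dict, Iterable, Iterator, List


class TitleBooster:
    """Hypothetical re-scorer that works on rows (dicts), not a dataframe."""

    def __init__(self, bonus: float = 1.0):
        self.bonus = bonus

    def transform_iter(self, inp: Iterable[Dict]) -> Iterator[Dict]:
        for row in inp:
            # Copy each row so the caller's dicts are left untouched.
            boosted = dict(row)
            # Add a bonus when the query text appears in the title.
            if row["query"].lower() in row["title"].lower():
                boosted["score"] = row["score"] + self.bonus
            yield boosted


results: List[Dict] = [
    {"qid": "q1", "query": "terrier", "docno": "d1",
     "title": "Terrier IR platform", "score": 2.0},
    {"qid": "q1", "query": "terrier", "docno": "d2",
     "title": "Dense retrieval", "score": 2.5},
]
boosted = list(TitleBooster(bonus=1.0).transform_iter(results))
```

Each result is handled as an ordinary dict, so per-row logic like this needs no dataframe manipulation.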
