Lightnews — Scholar-powered news

Michael Günther

@michael-g-u.bsky.social

53 followers 150 following 21 posts

ML @jina-ai.bsky.social
https://github.com/guenthermi


Michael Günther

@michael-g-u.bsky.social

Some examples for instructions:
- How to translate named entities and technical terms (e.g., "Big Data," "Embeddings")
- Specifying date formats (MM/DD/YY, DD/MM/YY, YYYY-MM-DD)
- Defining the tone of a text (e.g., formal vs. informal)
Nevertheless, an LLM's latency might be much higher.

January 26, 2025 at 4:28 PM
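
To illustrate how instructions like the ones in the post above could be passed to an LLM for translation, here is a minimal sketch in Python. The function name `build_translation_prompt` and the prompt wording are assumptions made for illustration; they do not come from the post or from any particular library.

```python
def build_translation_prompt(text, target_language, instructions):
    """Assemble a plain-text translation prompt (hypothetical helper)."""
    lines = [f"Translate the following text into {target_language}."]
    if instructions:
        # Each instruction becomes one bullet the model is asked to follow.
        lines.append("Follow these instructions:")
        lines.extend(f"- {item}" for item in instructions)
    lines.append("")
    lines.append("Text:")
    lines.append(text)
    return "\n".join(lines)


# Example instructions mirroring the three categories named in the post.
prompt = build_translation_prompt(
    "The meeting on 01/26/25 covers Big Data and Embeddings.",
    "German",
    [
        'Keep the technical terms "Big Data" and "Embeddings" untranslated.',
        "Write dates as YYYY-MM-DD.",
        "Use a formal tone.",
    ],
)
print(prompt)
```

The resulting string would then be sent to whichever LLM is in use; that call is left out because it depends on the provider.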

Michael Günther

@michael-g-u.bsky.social

Whether to use late chunking also depends on the chunk size: for smaller chunks, late chunking is generally more useful than for larger ones.

December 5, 2024 at 8:49 AM

Michael Günther

@michael-g-u.bsky.social

Chunking improves performance on fact-retrieval tasks but can actually harm performance on other retrieval tasks. Late chunking is useful for coherent datasets and is often a good compromise that helps embeddings retain context information while still focusing on details:

December 5, 2024 at 8:49 AM
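
For readers unfamiliar with late chunking, the pooling step can be sketched as follows: the whole document is first encoded by a long-context embedding model, and only afterwards are the token embeddings averaged per chunk. This is a minimal sketch, not the author's implementation. The function name `late_chunk` is hypothetical, and the tiny hand-written vectors stand in for real model output.

```python
def late_chunk(token_embeddings, spans):
    """Mean-pool token embeddings over each (start, end) token span.

    token_embeddings: list of equal-length vectors, one per token, assumed
    to come from encoding the full document in a single pass.
    spans: list of (start, end) token index pairs, end exclusive.
    """
    pooled = []
    for start, end in spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid span: ({start}, {end})")
        rows = token_embeddings[start:end]
        dim = len(rows[0])
        # Average each dimension across the tokens in this chunk.
        pooled.append([sum(row[d] for row in rows) / len(rows) for d in range(dim)])
    return pooled


# Toy stand-in for contextualized token embeddings (4 tokens, 2 dimensions).
token_embeddings = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
spans = [(0, 2), (2, 4)]  # two chunks of two tokens each

chunk_embeddings = late_chunk(token_embeddings, spans)
print(chunk_embeddings)  # [[2.0, 1.0], [6.0, 5.0]]
```

Because every token embedding was computed with the full document in view, each pooled chunk vector can carry context from outside its own span. Chunking the text first and embedding each chunk separately would not give it that context.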

Michael Günther

@michael-g-u.bsky.social

First, more input helps, but not for all retrieval tasks equally:

December 5, 2024 at 8:49 AM
