Isaac Flath
@isaac-flath.bsky.social
isaacflath.com
Yes, though it's a saying for a reason: the last mile/last 10% eating most of the time isn't new with vibe coding. So getting that first 90% in an hour is a massive win!
January 23, 2026 at 9:11 PM
For a detailed write-up and the full recording of this talk, go here!

elite-ai-assisted-coding.dev/p/mgrep-wit...
mgrep with Founding Engineer Rui Huang
The Problem with grep for AI Agents
elite-ai-assisted-coding.dev
December 10, 2025 at 6:28 PM
Unlike traditional RAG, which vectorizes large chunks of text, @mixedbreadai's engine represents every single word as its own vector.

This provides much more granular and accurate results.
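
Here's a minimal numpy sketch of that late-interaction idea (ColBERT-style MaxSim; the shapes and random vectors are illustrative, not Mixedbread's actual model):

import numpy as np

# Per-token embeddings: one vector per word, not one per chunk.
# Shapes are toy values: 4 query tokens, 12 doc tokens, 8 dims.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))     # query token vectors
D = rng.normal(size=(12, 8))    # document token vectors

# Each query token finds its best-matching document token;
# the document score is the sum of those best matches.
sims = Q @ D.T                  # (4, 12) token-to-token similarities
score = sims.max(axis=1).sum()  # granular, word-level matching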

Join @aaxsh18 tomorrow for research details on how it works: maven.com/p/0c0eed/mo...
Modern Multi-Vector Code Search
Mixedbread showed in their launch how much faster and better semantic search could make Claude Code. Cursor also just announced semantic embedding support, and other agents are soon to follow. I got better quality with half the tokens and almost 2x the speed using Mixedbread search. Learn what changed from the leading researchers in the space.
maven.com
December 10, 2025 at 6:28 PM
mgrep is also multimodal. It can natively index and search images, diagrams, and PDFs in your repository.

An agent can find relevant information in visual assets that are completely invisible to text-only tools.

Very useful for legal, e-commerce, and many other domains.

And cats
December 10, 2025 at 6:28 PM
The results from their internal tests with Claude are significant. Using mgrep led to:

🤌 53% fewer tokens used
🚀 48% faster response
💯 3.2x better quality

By getting the right context immediately, agents stay on track. I saw similar results.

elite-ai-assisted-coding.dev/p/boosting-...
Boosting Claude: Faster, Clearer Code Analysis with MGrep
I ran an experiment to see how a powerful search tool could improve an LLM’s ability to understand a codebase.
elite-ai-assisted-coding.dev
December 10, 2025 at 6:28 PM
Agents use mgrep for broad, semantic exploration and grep for precise symbol lookups.

Instead of guessing at keywords with grep commands, an agent makes a semantic query:

mgrep "how is auth implemented?"

It then uses grep for precise function/class name searches.

No guessing 😁
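
Here's a rough Python sketch of that two-stage loop (the symbol name and mgrep's one-path-per-line output format are assumptions for illustration):

import subprocess

# Stage 1: broad semantic exploration, querying by intent
hits = subprocess.run(
    ["mgrep", "how is auth implemented?"],
    capture_output=True, text=True,
).stdout.splitlines()

# Stage 2: precise lookup of an exact symbol in the surfaced files
for path in hits[:5]:
    subprocess.run(["grep", "-n", "def authenticate", path])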
December 10, 2025 at 6:28 PM
mgrep is a command-line tool that brings semantic search to your codebase, letting agents search by intent, not just keywords.

It's much faster than grep alone, and far more accurate than traditional semantic search.
December 10, 2025 at 6:28 PM
For all the code, charts, and a deeper dive into the mechanics, check out the full blog post

isaacflath.com/blog/2025-1...
Quantization Fundamentals for Multi-Vector Retrieval - Blog
A thorough introduction to Quantization for Multi-Vector Search Architectures
isaacflath.com
December 1, 2025 at 7:13 PM
This two-stage compression (PQ + residual quantization) means you get token-level understanding in a fraction of the space.
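
Back-of-envelope math on why that matters (toy config, not Mixedbread's exact numbers):

dim = 128
full_bytes = dim * 4                    # float32: 512 bytes per token vector
pq_bytes = 16                           # 16 sub-vectors x 1-byte centroid id
residual_bytes = dim * 2 // 8           # 2 bits per value -> 32 bytes
compressed = pq_bytes + residual_bytes  # 48 bytes vs 512: ~10x smaller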

The Mixedbread team has two free engineering and research talks coming up, covering the research and how to use it:

Talk 1: maven.com/p/9c51af/th...

Talk 2: maven.com/p/0c0eed/mo...
Modern Multi-Vector Code Search
Mixedbread showed in their launch how much faster and better semantic search could make Claude Code. Cursor also just announced semantic embedding support, and other agents are soon to follow. I got better quality with half the tokens and almost 2x the speed using Mixedbread search. Learn what changed from the leading researchers in the space.
maven.com
December 1, 2025 at 7:13 PM
Step 5: Extreme Quantization

ColBERT goes one step further. After PQ, it calculates the "residual" error (the small difference between the original and the approximation). Then, it quantizes that error, often down to just 1 or 2 bits per value!
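
A toy numpy version of that residual step (the 1-bit scheme and shared scale here are simplified assumptions, not ColBERT's exact recipe):

import numpy as np

x = np.array([0.80, -0.31, 0.05, 0.42])         # original sub-vector
centroid = np.array([0.75, -0.25, 0.10, 0.40])  # its PQ approximation

residual = x - centroid             # the small leftover error
scale = np.mean(np.abs(residual))   # one shared scale factor
bits = residual > 0                 # 1 bit per value: just the sign
x_hat = centroid + scale * np.where(bits, 1.0, -1.0)  # close to x again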
December 1, 2025 at 7:13 PM
Step 3: Store which cluster/centroid each piece belongs to.

Step 4: Reconstruct by looking up centroids and combining them.
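
In toy numpy form (centroid values made up for illustration):

import numpy as np

codebook_a = np.array([[0.1, 0.2], [0.9, 0.8]])   # centroids, first half
codebook_b = np.array([[-0.3, 0.0], [0.5, 0.4]])  # centroids, second half

codes = (1, 0)  # Step 3: store two small ids instead of four floats

# Step 4: look up each centroid and concatenate to reconstruct
approx = np.concatenate([codebook_a[codes[0]], codebook_b[codes[1]]])
# approx -> [0.9, 0.8, -0.3, 0.0]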
December 1, 2025 at 7:13 PM
Step 2: Cluster each collection of sub-vectors separately to find the centroids.
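
For example, with scikit-learn (sizes are illustrative; 256 centroids is a common choice so each code fits in one byte):

import numpy as np
from sklearn.cluster import KMeans

# 1,000 sub-vectors from one half of the split
sub_vectors = np.random.default_rng(0).normal(size=(1000, 2))

# Cluster this half on its own to learn its codebook
kmeans = KMeans(n_clusters=256, n_init=10).fit(sub_vectors)
centroids = kmeans.cluster_centers_   # the codebook for this half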
December 1, 2025 at 7:13 PM
Step 1: Split each embedding in half (make sub-vectors).
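
In numpy terms (toy 8-dim embedding):

import numpy as np

embedding = np.arange(8.0)              # a toy 8-dim embedding
first, second = np.split(embedding, 2)  # two 4-dim sub-vectors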
December 1, 2025 at 7:13 PM
However, AI embeddings aren't single numbers; they're vectors (long lists of numbers). This is where Product Quantization (PQ) comes in. It's specifically designed to compress these vectors.

It "refactors" similar embeddings to reduce duplication using k-means clustering. Let's break it down.
December 1, 2025 at 7:13 PM