David Tippett
@taidesu.bsky.social
Open Source enthusiast and builder of search systems. Currently building search at GitHub. Former DA for open-source OpenSearch @AWS. Opinions are my own.
🧵 ANNDDDDDDD MY SSDs WON'T MOUNT because the Kingston NV2 won't train on the CM5, and buying new drives is not ideal because storage cost is up 3-4x thanks to the DRAM shortages.

Thank you for coming to my Ted talk. 🫡 See you in 10 years when I finally finish this project.
February 13, 2026 at 4:03 PM
🧵 CM5s are ARMv8, so with a few of them I can do it! Of course none were available because of the initial shortage.

Finally get the CM5s and then I have to wait for heat sinks to become available. Wait for a while and finally get a community-made version. Now to install the CM5s FINALLY 🚀
February 13, 2026 at 4:03 PM
🧵 No worries, others are posting OpenSearch containers right?

Get it running for a bit using the bitnami/opensearch containers and BOOM - Bitnami pulls their containers and moves to a paid model -_- (I know I could make my own but I don't want to maintain that infra 😭)
February 13, 2026 at 4:03 PM
Time to fight fire with fire. Even worse AI-generated PR reviews.
February 9, 2026 at 5:10 PM
The right way to think about it is to say that Refresh writes segments to disk. Translog "stamps" them as being official (Lucene commit).

If the Translog reaches its flush size it'll trigger a refresh before committing the segment.

They're related, not as an either/or but as a both/and.
February 6, 2026 at 3:31 PM
Okay maybe just 6.0

101blockchains.com/web-5-0/

Who knows what happened to 3 & 4
What is Web 5.0 - Explained
January 20, 2026 at 12:31 PM
I’m pretty sure we’re on like web 7.0 according to a bunch of random crypto bros 😆
January 20, 2026 at 12:30 PM
So it’s funny the bus thread was right above this one in my feed but the “I didn’t get enough likes” I’ve not yet seen… 🔗 👀
January 20, 2026 at 12:28 PM
Together though we nailed the problem down in probably 1/10th of the time it would’ve taken before.
January 9, 2026 at 4:12 AM
On top of that, because we’d elected for a LOT of smaller shards (small relative to GitHub's size, mind you), we were ending up with a lot of really small segments which couldn’t be efficiently compacted because of concurrent merge limits.

Copilot was never bringing me that far on its own.
January 9, 2026 at 4:12 AM
After a bit of troubleshooting I realized what was happening. Every time segments merge in ES the vector graph has to be rebuilt. I knew that.

What I hadn't thought through is that because BBQ quantizes relative to a centroid, on every segment merge we had to re-quantize all the documents.
January 9, 2026 at 4:12 AM
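To make the re-quantization point concrete, here is a toy sketch. It is not Lucene's or Elasticsearch's actual BBQ implementation; it assumes a simplified 1-bit scheme where each dimension is encoded as 1 if it sits above the segment's centroid. The function names and sample vectors are made up for illustration. The point is only that the codes depend on the centroid, so when two segments merge and the centroid moves, the old codes can't be reused.

```python
def centroid(vectors):
    """Mean of the vectors, dimension by dimension."""
    dims = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]


def quantize(vectors):
    """Toy 1-bit quantization: 1 if a dimension is above the segment centroid."""
    c = centroid(vectors)
    return [[1 if x > cd else 0 for x, cd in zip(v, c)] for v in vectors]


# Two hypothetical segments, each quantized against its own centroid.
segment_a = [[0.0, 0.0], [2.0, 2.0]]
segment_b = [[8.0, 8.0], [10.0, 10.0]]
codes_a = quantize(segment_a)
codes_b = quantize(segment_b)

# After a merge the centroid shifts, so every vector gets re-encoded.
merged_codes = quantize(segment_a + segment_b)

print("per-segment codes:", codes_a + codes_b)
print("merged codes:     ", merged_codes)
```

With lots of tiny segments merging constantly, that re-encoding cost gets paid over and over, which matches the ingestion backlog described in this thread.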
We were seeing ingestion backing up exponentially. I had an intuition that it was a bottleneck with segment merges.

I didn't have to remember which endpoints show merge stats. I just had Copilot shoot out a profiling script and ran it on an interval.
January 9, 2026 at 4:12 AM
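The actual script isn't shown in the thread, so the following is only a minimal sketch of what interval-based merge profiling could look like. It assumes an unauthenticated cluster reachable over plain HTTP and polls the index stats API (`GET /_stats/merge`), reading the `merges` counters under `_all.total`. The function names, the base URL, and the polling defaults are hypothetical.

```python
import json
import time
import urllib.request


def fetch_merge_stats(base_url):
    """GET <base_url>/_stats/merge and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/_stats/merge", timeout=10) as resp:
        return json.load(resp)


def summarize_merges(stats):
    """Pull the cluster-wide merge counters out of an index stats response."""
    merges = stats["_all"]["total"]["merges"]
    return {
        "current": merges["current"],
        "total": merges["total"],
        "total_time_ms": merges["total_time_in_millis"],
    }


def merge_delta(prev, curr):
    """Difference between two samples: what happened during the interval."""
    return {
        "merges_completed": curr["total"] - prev["total"],
        "merge_time_ms": curr["total_time_ms"] - prev["total_time_ms"],
        "running_now": curr["current"],
    }


def poll(base_url, interval_s=30, samples=10):
    """Sample merge stats on an interval and print the per-interval delta."""
    prev = summarize_merges(fetch_merge_stats(base_url))
    for _ in range(samples):
        time.sleep(interval_s)
        curr = summarize_merges(fetch_merge_stats(base_url))
        print(merge_delta(prev, curr))
        prev = curr
```

Calling something like `poll("http://localhost:9200")` (a placeholder address) would print one delta per interval. If merge time per interval keeps climbing while `running_now` stays pinned, that points at merges as the bottleneck.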
For example, we just came across an issue with indexing vectors into Elasticsearch. ChatGPT/Copilot/Claude were all stumped.

Thankfully I (mostly) knew what I was doing and was able to guide them through the issue.

Deep expertise is still needed but is going to be harder and harder to come by.
January 9, 2026 at 4:02 AM