Nikhil Benesch
@benesch.bsky.social
Systems engineer @turbopuffer.bsky.social. Former CTO @materialize.com.
benesch.bsky.social
They’re calling it our most boring feature yet.

boring(n): mundane; works exactly as expected; the highest praise you can give a database
turbopuffer.bsky.social
by popular demand, now introducing one-dimensional vectors 🎈
benesch.bsky.social
tpuf Python client went async today! 🐡 ❤️ 🐍

Async client perf slightly edges out sync client perf under heavy query load.

github.com/turbopuffer/...
benesch.bsky.social
Big day today at turbopuffer HQ.
turbopuffer.bsky.social
turbopuffer is generally available. petabytes in prod.

turbopuffer.com/
Reposted by Nikhil Benesch
turbopuffer.bsky.social
when we said "coming soon" we really meant it

now puffin' in aws-us-east-1 and aws-eu-central-1
benesch.bsky.social
I did! Couldn't pass up the opportunity to work with this team!
benesch.bsky.social
Things move fast at turbopuffer. Now puffin' in aws/us-east-1 and aws/eu-central-1 too.
benesch.bsky.social
I've been dreaming about conditional writes on S3 for years. I couldn't have asked for a better way to celebrate than getting to ship tpuf on AWS. 🐡
turbopuffer.bsky.social
now available: turbopuffer AWS regions ☁️

come find us in us-west-2 and ap-southeast-2
benesch.bsky.social
It’s a rare day that my love of going to battle with build systems pays off like this. Kudos to GCP for a very impressive new SKU. 🐡💨
turbopuffer.bsky.social
GCP released a new ARM machine type with NVMe SSDs recently

70% increase in end-to-end production indexing throughput at 20-33% lower machine cost!
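
A rough back-of-the-envelope reading of those two numbers, as a sketch only. It assumes the 70% throughput gain is per machine and the 20-33% saving is per machine-hour; the post states neither, and the function name is made up for illustration.

```python
def throughput_per_dollar_gain(throughput_gain: float, cost_reduction: float) -> float:
    """Relative indexing throughput per unit of machine cost, new vs. old.

    Hypothetical helper: assumes both figures apply to the same machine-hour.
    """
    return (1.0 + throughput_gain) / (1.0 - cost_reduction)


# 70% more throughput at 20% lower cost, and at 33% lower cost.
low = throughput_per_dollar_gain(0.70, 0.20)
high = throughput_per_dollar_gain(0.70, 0.33)

print(f"roughly {low:.1f}x to {high:.1f}x indexing throughput per dollar")
```

Under those assumptions the two figures compound rather than add, which is why the combined effect comes out above 2x.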
benesch.bsky.social
Couldn't be more excited to be joining the team at @turbopuffer.bsky.social. 🐡💨
turbopuffer.bsky.social
welcoming two excellent additions to team tpuf this week: @arash11gt as a customer engineer, and @nikhilbenesch to the DB team from pasts at Materialize & Cockroach.

we're locking in a p999 eng team to build the best search engine for scale.
benesch.bsky.social
Just catching up on my NULL BITMAPS and this is easily the best intuition for write skew I've ever seen described. The analogy to merge skew in a codebase is genius.
benesch.bsky.social
Well well well: www.crunchydata.com/blog/pg_incr...

Incremental pipelines come to Postgres via Crunchy Data! This is like "dbt incremental", not true incremental view maintenance like @materialize.com or Snowflake's dynamic tables, but it's a neat step towards IVM.
benesch.bsky.social
Of course! Will be very excited to give the new DynamoDB-free SlateDB a spin.
Reposted by Nikhil Benesch
eatonphil.bsky.social
The 6th (and last of 2024!) NYC Systems talks are next Thursday! We've got @jaronoff.com of Omlet and @benesch.bsky.social of Materialize speaking. :)

nycsystems.xyz/december-202...
benesch.bsky.social
Same! I’ve been using Amethyst for over ten years at this point. When I discovered the project in 2013 I never expected that ianyh would still be lovingly maintaining it a decade later. One of my most loved pieces of software, for sure.
benesch.bsky.social
tl;dr the 10x TPS claim actually *is* based on a small but novel optimization inside of S3! Table buckets understand Iceberg naming conventions and adjust S3's rate limiting so that you start with budget for 55k/35k GETs/PUTs per sec rather than the general purpose default of 5.5k/3.5k.
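
The arithmetic behind the 10x figure, using only the request budgets quoted above. This is a sketch: the dictionary names are illustrative and not part of any AWS API.

```python
# Starting request budgets in requests per second, as quoted in the post.
GENERAL_PURPOSE = {"GET": 5_500, "PUT": 3_500}
TABLE_BUCKET = {"GET": 55_000, "PUT": 35_000}

# Ratio of the table bucket budget to the general purpose default, per operation.
ratios = {op: TABLE_BUCKET[op] / GENERAL_PURPOSE[op] for op in GENERAL_PURPOSE}

for op, ratio in ratios.items():
    print(f"{op}: {ratio:.0f}x the general purpose starting budget")
```

Both ratios come out to the same factor, so the claim holds for reads and writes alike.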
benesch.bsky.social
It just occurred to me that while the 3x query performance claim is based on a comparison to uncompacted tables, I hadn't actually seen the evidence for the 10x TPS claim.

After some digging I just found the explanation in the re:Invent talk (start at 14:24): youtu.be/1U7yX4HTLCI?...
AWS re:Invent 2024 - [NEW LAUNCH] Store tabular data at scale with Amazon S3 Tables (STG367-NEW)
YouTube video by AWS Events
youtu.be
benesch.bsky.social
Exciting! I'm so here for it. 👀
benesch.bsky.social
Disclaimer: I'm not involved with DuckDB, ClickHouse, or Iceberg, so take this all with a grain of salt.
benesch.bsky.social
Rust might be feasible! In fact I think DuckDB's Delta extension is based on a Rust library. And ClickHouse is starting to integrate some Rust too: clickhouse.com/blog/more-th...

Sadly Rust's Iceberg library is still relatively immature (e.g. no support for writes: github.com/apache/icebe...).
More Than 2x Faster Hashing in ClickHouse Using Rust
Rust’s rich type system and ownership model guarantee memory-safety and thread-safety. There are a fairly large number of useful libraries written on it, so we considered using them in ClickHouse.
clickhouse.com
benesch.bsky.social
Nice, looking forward to it!
benesch.bsky.social
@chris.blue you might be interested in this. Something like this fits in very neatly to your views on the commoditization of the PostgreSQL dialect/protocol (materializedview.io/p/databases-...).
Databases Are Commodities. Now What?
How will vendors differentiate when databases are commoditized? I've got three ideas.
materializedview.io