Mike Driscoll
@medriscoll.com
Founder @ RillData.com, building GenBI. Lover of fast, flexible, beautiful data tools. Lapsed computational biologist.
Reposted by Mike Driscoll
In data analytics, we're facing a paradox. AI agents can theoretically analyze anything, but without the right foundations, they're as likely to hallucinate a metric as to calculate it correctly. They can write SQL in seconds, but will it answer the right business question?
Data Modeling for the Agentic Era: Semantics, Speed, and Stewardship
Master the three pillars of agentic data modeling: Metrics SQL for semantics, sub-second analytics for speed, and AI guardrails for trusted insights.
www.ssp.sh
October 17, 2025 at 12:52 PM
DuckLake is a simpler, SQL-friendlier alternative to Iceberg.

“There are no Avro or JSON files. There is no additional catalog server or additional API to integrate with. It’s all just SQL.”

That said, choose your catalog database (a single point of failure) *very carefully*.
duckdb.org DuckDB @duckdb.org · May 27
Today we're launching DuckLake, an integrated data lake and catalog format powered by SQL. DuckLake unlocks next-generation data warehousing where compute is local, consistency central, and storage scales till infinity. DuckLake is an open standard and we implemented it in the "ducklake" extension.
May 27, 2025 at 9:33 PM
Reposted by Mike Driscoll
Quack... Quack... and code!
@mehdio.com and @medriscoll.com from @rilldata.com are diving into how GenAI is reshaping BI-as-code — from idea to implementation.

This one’s for data folks who want to see beyond the hype.

Register: lu.ma/w4ncmttn
BI-as-Code with GenAI+DuckDB Real Use, Not Just Hype · Luma
Mehdi and Michael dive into how GenAI is reshaping BI-as-code. And as always — it’s not just talk, it’s real code. Get ready for pragmatic insights and…
lu.ma
May 13, 2025 at 7:09 PM
Yo SF Bay Area #databs crew, want to talk lakehouses at a real Lake House? :)

Next week after Data Council, join the founders of @clickhouse.com, @motherduck.com, @startreedata.bsky.social, and @tobikodata.com to talk real-time databases and next-generation ETL.

www.rilldata.com/events/data-...
April 15, 2025 at 11:44 PM
"Shifting left" is the new trend in data stacks -- but what does it mean and why does it matter?
April 11, 2025 at 9:35 PM
Apache Pinot is one of the world’s fastest and most scalable real-time analytical databases, relied on by LinkedIn, Uber, and Stripe. It was awesome diving into the secrets behind its unique architecture with creator and @startreedata.bsky.social founder Kishore Gopalakrishna.
April 4, 2025 at 8:51 PM
Like others, I jumped on the bandwagon to ridicule the DOGE analyst who "overheated her hard drive" by analyzing just 60k rows of data.

I was wrong.

The truth is even dumber.

🧵
March 14, 2025 at 11:44 PM
Reposted by Mike Driscoll
Just published: Ever had to «Scale beyond Postgres»?

You may have started with a simple ETL pipeline and crunched critical business logic into useful dashboards, but at some point speed and scale stopped keeping up with the data and with the number of concurrent users.

✨ Below are some highlights from the article.
Rill | Scaling Beyond Postgres: How to Choose a Real-Time Analytical Database
This blog explores how real-time databases address critical analytical requirements. We highlight the differences between cloud data warehouses like Snowflake and BigQuery, legacy OLAP databases like ...
www.rilldata.com
March 11, 2025 at 2:24 PM
Reposted by Mike Driscoll
The father of relational databases understood semantic data models.
March 3, 2025 at 9:56 PM
Reposted by Mike Driscoll
Blogged: Exploring UK Environment Agency data with @duckdb.org and @rilldata.com

rmoff.net/2025/02/28/e...

#dataBS
February 28, 2025 at 11:38 AM
Why Pivot Tables Never Die

A brief history of software’s longest-lived and most-loved data tool, from Lotus 1-2-3 and Excel to QlikView and PowerBI, by @ssp.sh

www.rilldata.com/blog/why-piv...
Rill | Why Pivot Tables Never Die
This blog is about how the simplest tools often solve the hardest problems. Simon Spati explores why pivot tables have endured for decades, how they evolved in the AI era, and why they might be t...
www.rilldata.com
February 4, 2025 at 5:50 AM
DuckCon #6 is now live from Amsterdam (link below!)

In about 20 minutes I'll be sharing some work we've been doing at @rilldata.com on metrics-layer-powered dashboards. And speculating in my last slide, because why not, about what you would name an AI agent that runs on DuckDB... 🦆
January 31, 2025 at 2:41 PM
“Pivot tables are just a shorter, fatter GROUP BY”.

❤️ this by @gregat.es
will just leave this here in case you're interested gregat.es/pivot-shorte...
gregat.es
January 29, 2025 at 9:57 AM
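To make the "shorter, fatter GROUP BY" line concrete, here is a minimal sketch in plain Python. The sales rows and the function names group_by and pivot are hypothetical, invented for illustration; they do not come from the linked post. The idea is that a pivot table holds the same aggregates as a GROUP BY on two columns, with one of those columns spread out into headers, so the result has fewer rows and more columns.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, quarter, amount). Illustrative data only.
rows = [
    ("east", "Q1", 10),
    ("east", "Q2", 20),
    ("west", "Q1", 5),
    ("west", "Q2", 7),
    ("east", "Q1", 3),
]


def group_by(rows):
    """Long form: one entry per (region, quarter), like GROUP BY region, quarter."""
    totals = defaultdict(int)
    for region, quarter, amount in rows:
        totals[(region, quarter)] += amount
    return dict(totals)


def pivot(rows):
    """Wide form: one row per region, one column per quarter."""
    grouped = group_by(rows)
    quarters = sorted({quarter for _, quarter in grouped})
    table = {}
    for (region, quarter), total in grouped.items():
        # Start each region with a zero in every quarter column, then fill in.
        table.setdefault(region, {q: 0 for q in quarters})[quarter] = total
    return table


long_form = group_by(rows)  # 4 entries, each with a single total
wide_form = pivot(rows)     # 2 rows, each with 2 quarter columns

for region, columns in wide_form.items():
    print(region, columns)
```

Both results carry the same four totals. The pivot only rearranges them, which is why it comes out shorter (fewer rows) and fatter (more columns).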
Reposted by Mike Driscoll
DuckCon #6 will start in 168 hours (January 31, 15:00)! If you plan to attend in-person, please register at duckdb.org/2025/01/31/d...

The stream will be available without registration.
January 24, 2025 at 2:01 PM
#databs welcomes the creator of Pandas, founder of Datapad & Voltron, and all around nice guy Wes McKinney @wesmckinney.com to BlueSky.
January 23, 2025 at 1:14 AM
I'm collaborating with @ssp.sh on a brief history of pivot tables.

We'll be tracing their lineage and evolution across Visicalc, Lotus, Excel, PowerPivot, Qlik, and PowerBI.

Any sites, videos, products (dead or alive) that you would recommend we mention or dig into?
January 16, 2025 at 7:53 PM
Real-time user experiences make applications magical.

Google, WhatsApp, and ChatGPT would all fail if you added just 10 seconds to every search, message, or prompt interaction.

We are surrounded by fast user experiences and yet most of us tolerate business dashboards that take minutes to load.
"If we think about the core offering of Rill, we focus on combining large scale data transformation, a very fast OLAP data engine, and ultimately delivering value through an interactive, *fast*, flexible dashboard. That is really the face of Rill and that was historically the face of Metamarkets.”
January 16, 2025 at 6:48 PM
Reposted by Mike Driscoll
It took 5 minutes to create this dashboard with @rilldata.com's AI auto-generate features. Impressive @medriscoll.com!
December 15, 2024 at 10:17 PM
Reposted by Mike Driscoll
👋 @davistreybig.bsky.social Welcome! All, Davis is a VC at innovationendeavors.com and someone that shares my vision around object storage. We also co-invested in responsive.dev together! 😁 Here's one of his posts. You should follow him.
S3 as the universal infrastructure backend
Why BLOB stores are becoming the default storage layer for cloud services
medium.com
December 12, 2024 at 6:09 PM
I’ve often said working in data engineering is like working in the post office. Queues get backed up, deliveries are lost, yet the stream never stops. It’s a thankless, mostly invisible job until something goes wrong, & then it’s complaints.

I hoped the analogies stopped there, but sadly maybe not.
December 9, 2024 at 8:33 PM
Reposted by Mike Driscoll
I had the pleasure of talking with @medriscoll.com on his podcast. I love this long-form discussion, especially on a snowy day like this :)

Talked about:
> My journey (DE ↠ Author)
> Bluesky🦋, DuckDB, S3, data modeling & declarative stacks
> What do you call data people?

🎙️ youtu.be/xW_HGb46xMk
December 6, 2024 at 11:15 AM
Snowflake’s pricing is a riddle wrapped in a mystery inside an enigma.
Also their pricing page is a huge PDF file which makes it pretty hard to understand/estimate the cost. I hated it so much that I spent a day turning it into a web app: snowflake-cost-calculator.universql.com
Snowflake Cost Calculator
Compare different clouds, services, warehouse sizes, and container services to estimate your Snowflake costs.
snowflake-cost-calculator.universql.com
December 6, 2024 at 2:32 AM
Nice review of the latest CSV parsing & querying capabilities of @duckdb.org.

Parquet is obviously better than CSV because of its stronger types, efficient compression, and support for predicate pushdown.

But given CSV is *everywhere*, these improvements are welcome.

duckdb.org/2024/12/05/c...
CSV Files: Dethroning Parquet as the Ultimate Storage File Format — or Not?
Data analytics primarily uses two types of storage format files: human-readable text files like CSV and performance-driven binary files like Parquet. This blog post compares these two formats in an ul...
duckdb.org
December 5, 2024 at 8:06 PM
@sfchronicle.com maybe you could find it in your heart to remove the paywall on the TSUNAMI EVACUATION MAP?

www.sfchronicle.com/bayarea/arti...
December 5, 2024 at 7:42 PM
OK if this stochastic parrot's answer is to be trusted, I'll be bracing for the tsunami's impact around 12:15-12:30pm.
December 5, 2024 at 7:24 PM