Lightnews — Scholar-powered news

Dmitriy Ryaboy

@squarecog.bsky.social

This is a pretty intriguing idea for future proofing file formats.
It does assume wasm is future proof, of course, but that feels like a safer bet than "assume readers are updated"

Andy Pavlo @andypavlo.bsky.social · Oct 1

Our F3 files embed small WASM programs to decode data. If somebody creates a new encoding and the DBMS does not have native impl, it can still read data using WASM passing Arrow buffers. Our experiments show WASM is 15-20% slower than native. We use @spiraldb.com's Vortex encoding impls.

Overview of F3's decoding pipeline with WASM support.

October 1, 2025 at 3:23 PM
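
The fallback described in the quoted post can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not F3's actual API: the names `NATIVE_DECODERS`, `read_column`, `decode_plain`, and `decode_rle` are hypothetical, the two encodings are made up, and a plain Python function stands in for the WASM program a real file would embed.

```python
from typing import Callable, Dict, List

# Hypothetical type: a decoder turns encoded bytes into a list of integers.
Decoder = Callable[[bytes], List[int]]


def decode_plain(data: bytes) -> List[int]:
    """Native decoder for a made-up 'plain' encoding: one byte per value."""
    return list(data)


def decode_rle(data: bytes) -> List[int]:
    """Stand-in for a decoder shipped inside the file.

    Decodes (count, value) byte pairs. In the design the post describes,
    this role is played by an embedded WASM program; a Python function is
    used here only to keep the sketch runnable.
    """
    out: List[int] = []
    for i in range(0, len(data) - 1, 2):
        count, value = data[i], data[i + 1]
        out.extend([value] * count)
    return out


# Decoders the reader was built with (assumption: it only knows 'plain').
NATIVE_DECODERS: Dict[str, Decoder] = {"plain": decode_plain}


def read_column(encoding: str, data: bytes,
                embedded: Dict[str, Decoder]) -> List[int]:
    """Prefer a native decoder; otherwise fall back to the embedded one."""
    native = NATIVE_DECODERS.get(encoding)
    if native is not None:
        return native(data)
    fallback = embedded.get(encoding)
    if fallback is None:
        raise ValueError(f"no decoder available for encoding {encoding!r}")
    return fallback(data)
```

Under these assumptions, a reader that has never heard of the `rle` encoding can still decode it as long as the file carries its own decoder, which is the property the post highlights.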

Dmitriy Ryaboy

@squarecog.bsky.social

If you love this sort of thing, read up on C-store, which introduced this idea in 2005 and commercialized it in Vertica. Stonebraker, Sam Madden, Daniel Abadi.
Parquet was also partially inspired by Vertica (and Google's Dremel, and PAX by Natassa Ailamaki et al) :-).

DuckDB @duckdb.org · Sep 29

Are you streaming into your Lakehouse?

Traditional formats suffered with the “many small files” problem — OLAP engines merge them reactively with long jobs. ⏳

DuckLake takes a proactive path: Data Inlining + async flush to parquet while always keeping data queryable ⚡

September 30, 2025 at 3:04 PM

Dmitriy Ryaboy

@squarecog.bsky.social

ML is just applied stats.
Stats is just applied algebra.
LLM is just ML backward and with an extra L.

September 22, 2025 at 10:50 PM

Dmitriy Ryaboy

@squarecog.bsky.social

The obvious reaction here is to shift at least some of the hiring out of the country to get access to the talent. The obvious counter reaction is to tax payments and wages to foreign employees and contractors. Which will also provoke a reaction. And none of this makes the US stronger or smarter.

September 20, 2025 at 9:17 PM

Dmitriy Ryaboy

@squarecog.bsky.social

About a decade late with this, but:
Someone should have started a social media ad agency called Twaddle.

September 20, 2025 at 6:25 PM

Dmitriy Ryaboy

@squarecog.bsky.social

It's tempting to take shortcuts that give you speed today by mortgaging speed tomorrow.

Trouble is, today is yesterday's tomorrow.

September 18, 2025 at 11:07 PM

Dmitriy Ryaboy

@squarecog.bsky.social

I tried 2 different English-to-insights SQL LLM agents from reputable vendors in the past week. Data analyst jobs remain safe.
Firmly in the toy category for now.

September 17, 2025 at 12:25 AM

Dmitriy Ryaboy

@squarecog.bsky.social

Happened to be by the Cloudera building in the south bay earlier. Checked LinkedIn and discovered I have literally 0 1st degree connections who work there now. Not unexpected, I guess, but, man... between hnwx and cldr I used to know like 100s of folks there

September 16, 2025 at 5:14 AM

Dmitriy Ryaboy

@squarecog.bsky.social

Heck of a vote of confidence from LinkedIn for Flyte: www.linkedin.com/blog/enginee...

OpenConnect: LinkedIn's Next-Generation AI Pipeline Ecosystem

September 15, 2025 at 3:02 AM

Dmitriy Ryaboy

@squarecog.bsky.social

About once every two years I have cause to re-learn a very important data lesson: never, ever, trust analysis based on ratio metrics.

September 10, 2025 at 9:09 PM

Dmitriy Ryaboy

@squarecog.bsky.social

Trying and failing to make the page edits look right? Is offline access lackluster? Tired of AI upsell as a replacement for poor search quality?

You might be suffering from Notion sickness.

August 30, 2025 at 4:13 AM

Dmitriy Ryaboy

@squarecog.bsky.social

Phone book, noun: an ebook you read on your phone.

August 26, 2025 at 1:11 AM

Dmitriy Ryaboy

@squarecog.bsky.social

Looking up Latin phrases on Google results in an AI response in French a good % of the time. Fortunately, my French is slightly better than my Latin.

August 14, 2025 at 3:15 PM

Dmitriy Ryaboy

@squarecog.bsky.social

The scariest thing about dinosaurs is that they were huge, absolutely dominant, *and humans had nothing to do with them dying out*

August 13, 2025 at 1:03 AM

Dmitriy Ryaboy

@squarecog.bsky.social

Andor is a very boolean show.

July 19, 2025 at 5:12 AM

Dmitriy Ryaboy

@squarecog.bsky.social

Just 45 minutes north of SF, and AI means something completely different on a dairy farm in Sonoma, in the context of breeding livestock. I seriously wasn't tracking for a few minutes there.

July 10, 2025 at 12:29 AM

Dmitriy Ryaboy

@squarecog.bsky.social

A Random Walk In Question Space:

A particularly ineffective, but popular, investigation method commonly found when the investigator does not have direct access to the data, and does not directly incur the cost of asking for irrelevant analysis that does not get them any closer to useful answers.

June 10, 2025 at 4:11 PM

Dmitriy Ryaboy

@squarecog.bsky.social

Still thinking about it. Serious "retired samurai" energy.

Man's just standing there, pouring beers, drying glasses, listening to young coders in this city of startups go through their ups and downs. Dispensing wisdom about code and life.

Dmitriy Ryaboy @squarecog.bsky.social · Jun 4

Just saw a LinkedIn profile that is a real career to aspire to:
* CS Professor
* Google Director of Eng
* Distinguished Eng at Microsoft
* Brewer and bar owner.

June 5, 2025 at 3:55 PM

Dmitriy Ryaboy

@squarecog.bsky.social

A friend once said that experienced software folks get a weird superpower: they can diagnose and even predict bugs / seams in systems they've never seen the code for, just from knowing how stuff is put together.
It's a thrill every time one pulls this off successfully :).

June 4, 2025 at 7:54 PM

Dmitriy Ryaboy

@squarecog.bsky.social

Just saw a LinkedIn profile that is a real career to aspire to:
* CS Professor
* Google Director of Eng
* Distinguished Eng at Microsoft
* Brewer and bar owner.

June 4, 2025 at 7:42 PM

Dmitriy Ryaboy

@squarecog.bsky.social

DuckLake makes tons of sense, as all things DuckDB does. How long till they implement CMETA style optimizations right in the query planner? vldb.org/pvldb/vol14/...

June 3, 2025 at 12:35 AM

Dmitriy Ryaboy

@squarecog.bsky.social

I am just a boy
Standing in front of an agent
Asking it to try again and think harder.

May 29, 2025 at 10:59 PM

Dmitriy Ryaboy

@squarecog.bsky.social

Sure, an English-to-SQL agent on your DB won't give you an answer as accurate as a data analyst would, but it will give you incorrect answers so much faster!

May 29, 2025 at 3:51 PM

Dmitriy Ryaboy

@squarecog.bsky.social

Seeing com.google.hadoop is still a mind bender.
Don't think any of us saw this coming back in the day... or at least I didn't.

May 28, 2025 at 9:01 PM

Dmitriy Ryaboy

@squarecog.bsky.social

Surprising how much people talk about Iceberg providing "separation of data and compute" given that it, you know, doesn't.

May 28, 2025 at 6:10 PM
