Frank McSherry
banner
frankmcsherry.bsky.social
Frank McSherry
@frankmcsherry.bsky.social
While I think the worst-case optimality is great to read about, it also results in performance that improves on what I've been able to do with conventional binary joins (now at around 11.9s, vs ~13s for the best binary join plan I found).

Yay database theory!
December 4, 2025 at 1:33 AM
And now down to 14.2s (storing some data in 4 rather than 2 bytes to get an optimized implementation to kick in).
November 22, 2025 at 1:50 AM
(( There's other recent work on join robustness, which I think dovetails with WCOJ, but may be technically distinct. I'm thinking of things like arxiv.org/abs/2502.15181 and arxiv.org/abs/2403.01631 ))
Debunking the Myth of Join Ordering: Toward Robust SQL Analytics
Join order optimization is critical in achieving good query performance. Despite decades of research and practice, modern query optimizers could still generate inferior join plans that are orders of m...
arxiv.org
November 21, 2025 at 7:41 PM
Wrt FreeJoin, I think the columnar approach makes it clearer that you can take the old and new behaviors a la carte, without needing custom datastructures and new papers. That's subject to verification from other folks working in the space, but it seems pretty easy to get the best of both worlds.
November 21, 2025 at 7:39 PM
For example, when doing the graph motif tracking in www.vldb.org/pvldb/vol11/..., we found one of the nodes had degree ~45M. Any join implementation that investigates the 45M^2 possible pairs of neighbors would just be dead in the water.
www.vldb.org
November 21, 2025 at 7:35 PM
My experience has been that the column-oriented "bfs generic join" has been plenty practical. If you are dealing with cyclic joins, don't wait to try out at least some flavor of WCOJ, though. When you need them, you really really need them.
November 21, 2025 at 7:34 PM
And many thanks due to @dov.dev and @antiguru.bsky.social, who both asked questions and proposed good ideas.
November 2, 2025 at 8:22 PM
One thing I like is that I stopped being a bad researcher and .. did some reading rather than only writing. There are some papers about columnar Datalog already (links at the end)! However, I don't think I fully understand them yet. More reading required!
October 9, 2025 at 6:33 PM
The closest I've found is github.com/kparc/ksimple which I'll work through, but .. most of the write-ups seem to conflate the esoteria of monadic/dyadic glyphs with the implementation architecture you might use, and .. boy do I want to read about the latter without trudging through the former.
GitHub - kparc/ksimple: k/simple is a bare minimum k interpreter for learning purposes by arthur whitney
k/simple is a bare minimum k interpreter for learning purposes by arthur whitney - kparc/ksimple
github.com
August 29, 2025 at 5:27 PM