John Kutay
@jkxosound.com
2.1K followers 350 following 290 posts
Product & Engineering @ Striim (streaming sql, change data capture), creator/host @ What's New in Data pod, music @ jkxo
Pinned
jkxosound.com
"If I could also get added to the PR before you refactor the schema design that would be great"
Reposted by John Kutay
clare.dev
Fun conversation with @jkxosound.com about where I think things are going in the AI agents space, Strands Agents, and rats 🐀
jkxosound.com
@clare.dev is one of the leaders making AI engineering simple and scalable at AWS. Had a great time chatting with her as we discussed Strands Agents and patterns like “Retrieval as a Tool”.
jkxosound.com
Really appreciate the depth at which the Hex team broke down their Text-to-SQL implementation. Everyone's trying to teach LLMs SQL like it's a training problem but it's really a graph traversal problem.
jkxosound.com
In my post about #DuckDB I digress into the role of the database buffer cache to explain how we separate transactional workloads from analytical ones. DuckDB turned out to be a natural, lightweight way to offload analytical queries so that our application kept meeting its performance requirements.
jkxosound.com
Instead of materialized views, we built in-process DuckDB caching in the control plane of Striim Developer — improving query performance 5–10x with zero added infra.

PostgreSQL for OLTP, DuckDB for Operational OLAP. But I won't call it HTAP 🤐

medium.com/striim/beyon...
Beyond Materialized Views: Using DuckDB for In-Process Columnar Caching
In this post we will talk about using DuckDB as the operational analytics store for the control plane of Striim Developer — a serverless…
medium.com
jkxosound.com
@marcbrooker.bsky.social breaks down how they've architected a fully ACID-compliant database service that combines simple, serverless management with high availability and massive scale on AWS Aurora DSQL.

youtube.com/shorts/dScUi...
Distributed PostgreSQL with Aurora DSQL
YouTube video by Striim
youtube.com
jkxosound.com
Thanks Marc! Super fun to learn how you combined the best parts of PostgreSQL and your own distributed processing engine.
jkxosound.com
This was actually my longest podcast ever at over 70 minutes. Not sure I could have made it any shorter because nerding out on databases with Andy Pavlo was too fun.
jkxosound.com
Was super fun chatting with @andypavlo.bsky.social to kick off the new season of What's New in Data. We dive into vector databases, text-to-SQL, trends in data infrastructure, and Andy's awesome (and open) database course.

youtube.com/shorts/tjLmx...
Andy Pavlo on Vector Databases
YouTube video by Striim
youtube.com
jkxosound.com
A side effect of LLMs: I'm taking on way more than I ever have in my life. I don't know if this is more productive or diluting myself. tbd!
jkxosound.com
Just found out one of the internal b2b CRUD app vendors is more like CRD because it doesn't support updating submissions. AI gonna cook that sector so hard.
jkxosound.com
and that’s why I’m working on a Saturday morning 🫠
jkxosound.com
Your adversaries are taking (not my) Presidents Day off. Time to ship. 🚀
jkxosound.com
I’ll never forget where I was the day I learned oats could be milked.
jkxosound.com
Them: Wait so you're saying I don't need to deploy Kafka?
Me: No
Them: Kinesis?
Me: No
Them: Zookeeper? YARN?
Me: No
Them: Will you write every record to disk and replicate it?
Me: No

Unfortunately the bar of complexity for streaming has been set so high. I'm calling it Streamholm Syndrome.
jkxosound.com
I'm not sure whether to be more amazed at the hate for Fivetran's price increase or the fact that Reddit doesn't know Striim exists and is proposing batch solutions to this person's obvious streaming CDC use case.

www.reddit.com/r/dataengine...
I am trying to escape the Fivetran price increase
www.reddit.com
jkxosound.com
They really gave the smell of rain an epic name: petrichor. They really did that.
jkxosound.com
If you don't have that type of scale, but simply want a reliable, real-time streaming service, you can use Striim Developer for free 🤘
signup-developer.striim.com
Sign Up - Striim Developer
signup-developer.striim.com
jkxosound.com
A single Striim cluster (multi-node for scalability and fault tolerance) can handle 35k very wide, very active databases that produce millions of DML operations per hour and dozens of DDL changes per day. The 'intelligence' layer of Striim was able to apply rule-based logic on how to handle complex DDL.
jkxosound.com
I will die on this hill but MySQL's 'ALTER TABLE ADD COLUMN AFTER' DDL is pointless. It doesn't change the layout on disk. If you care about the order of the columns, that's purely a read-side construct and you should address it in your query, not your DDL!
jkxosound.com
We’ve shifted embedding generation and transformers left into the streaming layer to support near real-time RAG. Take a read if you want to hear about the optimizations we made for change data capture and incremental embedding generation.

www.striim.com/blog/real-ti...
Real-Time RAG: Streaming Vector Embeddings and Low-Latency AI Search
Imagine searching for products on an online store by simply typing “best eco-friendly toys for toddlers under $50” and getting instant, accurate results—while the inventory is synchronized seamlessly ...
www.striim.com