Adrian Brudaru
datateam.bsky.social
Data engineer & cofounder @dlthub. Building out the tooling I wish I had.
After pushing LLMs to their limits, we found a better way.

A hybrid model that grounds AI in verified facts → fewer hallucinations, faster onboarding, and data pipelines that just work.

Read more here 👉
The feature we were afraid to talk about
This is the story of how we made our LLM generation workflow superior to starting from raw docs.
dlthub.com
The real AI win isn't superhuman agents, it's scaled mediocrity.
Doing less with less at massive scale unlocks tasks that were once uneconomical.
The magic is in aggregate value, not perfect outputs. Empower teams with practical AI tools. 
🔗 https://dlthub.com/blog/the-real-ai-win-scaled-mediocrity
This isn't about AI taking jobs. It's about AI exposing what the real job should have been all along.
Building systems of trust. Asking "do I trust this output?" instead of "how do I write this script?"
The intern is here. It's waiting for a manager.
Surviving the AI code Deluge: Data quality in the Spotlight
This is, we’re told, the great democratization of data engineering. The tedious work is gone. The barrier to entry is gone. Everyone can now be a data engineer.
dlthub.com
Then the LLMs arrived.
Suddenly, there's an AI intern who can write that heroic query in seconds. An intern who can build a thousand handcrafted pipelines a day.
It's here to automate the firefighter's job into oblivion.
For years, the most celebrated person on the data team was the one who could write the heroic, last-minute query.
We celebrated firefighters. Craftsmen.
We built a culture around reacting, not architecting. A leverage trap of our own making.
4/5 Whether you're 😤 tired of rebuilding ingestion pipelines, 🔧 looking to integrate Python tools into your data stack, or just 💡 curious about modern ELT approaches, this guide has something for you.
3/5 The best part? Anyone who knows Python can build senior-level pipelines.

No new frameworks. No specialized expertise. Just Python doing what it does best.
2/5 What Erfan's guide covers:

⚡ ETL vs ELT & how to avoid data swamps
🐍 Why Python devs need simpler data tools
⚙️ dlt automates schema evolution, incremental loading & data contracts
💻 A practical MySQL-to-DuckDB example you can try today
☁️ Plus real deployment options: Lambda, Airflow, Kubernetes
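
For readers new to the terms in the list above, here is a toy sketch of what "schema evolution" and "incremental loading" mean. It is plain Python written for illustration only, not dlt's implementation or API; the function names, the `updated_at` cursor field and the state dictionary are all assumptions made for this example.

```python
from typing import Any


def evolve_schema(schema: dict[str, str], record: dict[str, Any]) -> list[str]:
    """Add a column for every key not seen before; return the new column names."""
    added = []
    for key, value in record.items():
        if key not in schema:
            # Record the Python type name as a stand-in for a column type.
            schema[key] = type(value).__name__
            added.append(key)
    return added


def load_incrementally(
    rows: list[dict[str, Any]],
    schema: dict[str, str],
    state: dict[str, Any],
    cursor_field: str = "updated_at",  # hypothetical cursor column
) -> list[dict[str, Any]]:
    """Keep only rows newer than the stored cursor, evolving the schema as rows arrive."""
    last_seen = state.get("last_value")
    loaded = []
    for row in rows:
        if last_seen is not None and row[cursor_field] <= last_seen:
            continue  # already loaded in an earlier run
        evolve_schema(schema, row)
        loaded.append(row)
    if loaded:
        # Remember the highest cursor value for the next run.
        state["last_value"] = max(r[cursor_field] for r in loaded)
    return loaded
```

Calling `load_incrementally` twice with overlapping batches loads each row once, and a field that first appears in the second batch simply becomes a new column in `schema`.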
1/5 Perfect Friday reading: Erfan Hesami's guide to dlt makes data pipelines actually simple.

Our co-founder spent a decade rebuilding the same pipelines. One question changed everything: "What if there was a way to reuse code?"

That became dlt.

🧵👇
The Zero Theorem (2013)

A lonely data worker tries to prove that all his computation adds up to zero. Ten years later, we call it AI productivity.
Scalable Lakehouse Architectures with Iceberg and Polaris!

Simon from Tactile shared insights on tackling bottlenecks in data loading using Apache Iceberg and our open-source dlt library.

youtu.be/gb5fwIO4pX0?...

#databs #iceberg
Scalable Lakehouse Architecture with Iceberg & Polaris: A Battle-tested Playbook
YouTube video by Apache Iceberg
youtu.be
Trump and Putin together are supporting a fascist candidate in the Romanian presidential elections.

Enough said.
I’ve been quiet on this too long. I’m not speaking as a founder here. Just as a person who believes that racism and hate have no place in a just society.

Trumpism isn’t “just politics.” It’s an ideology that dehumanizes.

I won’t make space for it in my life, my conversations, or my circle.
Vibe code dltHub Jaffle Shop API: using Curl to get responses for pagination.

With this workflow, you let your agent grab information that isn't available in the docs, such as the response structure.

#dataengineering #databs

www.youtube.com/watch?v=fpNZ...
YouTube video by dltHub
www.youtube.com
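
As a rough illustration of the idea in the post above, this is the kind of summary an agent could build from one captured response (for example, the body and headers saved from a curl call). It is a minimal sketch using only the standard library: the function name, the report fields and the list of "next" keys are assumptions for this example, and they are not taken from the Jaffle Shop API or from dlt.

```python
import json
from typing import Any

# Hypothetical field names that often signal body-based pagination.
NEXT_KEYS = ("next", "next_page", "next_cursor", "next_url")


def describe_response(body: str, headers: dict[str, str]) -> dict[str, Any]:
    """Summarize the structure of one JSON response and any pagination hints."""
    payload = json.loads(body)
    report: dict[str, Any] = {
        "top_level": type(payload).__name__,
        "records_key": None,
        "pagination": "none found",
    }

    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    if 'rel="next"' in lowered.get("link", ""):
        report["pagination"] = "link header"

    if isinstance(payload, dict):
        for key, value in payload.items():
            # The first list-valued field is the likeliest home of the records.
            if isinstance(value, list) and report["records_key"] is None:
                report["records_key"] = key
            if key in NEXT_KEYS and value:
                report["pagination"] = f"body field '{key}'"
    return report
```

Feeding the resulting report back to the agent gives it the response shape that the docs may leave out.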
We just dropped a 4h knowledge bomb in collaboration with @freecodecamp.bsky.social and #DataTalksClub

It's designed for people already in the data field who want to upskill to senior DE knowledge in data loading best practices.

www.youtube.com/watch?v=T23B...

#dataengineering #databs
Data Engineering with Python and AI/LLMs – Data Loading Tutorial
YouTube video by freeCodeCamp.org
www.youtube.com
Loved the energy around Apache Iceberg at the Amsterdam meetup.

At dltHub, we’re focused on making Iceberg actually usable—modernizing legacy stacks with auto-ingestion, schema handling, and pipeline orchestration.

Modernization doesn’t have to hurt.

www.linkedin.com/posts/data-t...
#apacheiceberg #moderndatastack #datapipelines #dlthub | Adrian Brudaru
Violetta’s talk with Lakekeeper showed exactly what we’re focused on at dltHub: making Iceberg actually usable, by modernizing your data stack and automating ingestion, schema evolution, and pipeline ...
www.linkedin.com
1/2 It’s not that LLMs can’t code pipelines. It’s that they read your API docs and said: ‘nah.’

Just dropped a benchmark test on AI-generated EL pipelines using Pipedrive.

TL;DR – bad docs reliably break even the best models.

Got 3 solid examples of where things went wrong.
Vibe coding EL pipelines - an exploration and first success. Inside:

- problem definition
- a first experiment that turned out surprisingly successful, with some learnings enclosed
- I will keep pushing and see how far it's possible to go with more complex cases

dlthub.com/blog/convert...

#databs
From Airbyte YAML to Python with dlt, Cursor and LLMs
In this microblog + video we explore generating Python pipelines (dlt REST API) from an Airbyte low-code YAML spec. Tl;dr: it works well.
dlthub.com