Ananth Packkildurai
ananthdurai.bsky.social
3.5K followers 580 following 240 posts
Editor, Data Engineering Weekly; subscribe at www.dataengineeringweekly.com. In Progress, LakeByte
This is the most personal essay that I have written in Data Engineering Weekly. I shared a few key moments in my life and how fortunate I was to meet mentors along my professional journey, which shaped my career.
Thinking Like a Data Engineer
A Journey Beyond Code — Toward Systems, Curiosity, and Confidence
www.dataengineeringweekly.com
🚀 Data Vault vs. Dimensional Modeling vs. Medallion Architecture — When viewed through a modern enterprise data lens, these techniques interlock.

I break down how in Part 2 of my “Revisiting the Medallion Architecture” series.
Revisiting Medallion Architecture: Data Vault in Silver, Dimensional Modeling in Gold
How to Balance Flexibility and Performance in a Modern Data Platform
www.dataengineeringweekly.com
Fivetran and dbt form a strong foundation for modern data infrastructure, known for bringing simplicity to complex engineering workflows. That said, calling it “open” data infrastructure feels like a stretch.
Should we update the definition of an "Analytical Engineer"?
As a data engineer, you can't treat zero-party (consent) and third-party (inferred) data the same way. This distinction is critical for building systems that are scalable, private, and trustworthy.

Here’s my guide:
Engineering Growth: The Data Layers Powering Modern GTM
Building privacy-preserving pipelines that unify zero-, first-, second-, third-, and fourth-party data into a coherent GTM ecosystem.
www.dataengineeringweekly.com
Could be. Composable CDP has not gained significant market share, as identity resolution is a key component that is often proprietary.
With Census already in with Fivetran and dbt, it is most likely to evolve into a composable CDP.
Airbnb: Real-Time Key-Value Store

Airbnb’s next-gen key-value store supports real-time ingestion and bulk uploads with sub-second latency, powering feature stores and fraud detection.

Read the full story here: www.dataengineeringw...
Grab: Partner Gateway Metrics at Sub-Second Speed
Real-time partner analytics at scale is tough. Grab uses Apache Pinot, Kafka–Flink ingestion, partitioning, and Star-tree indexing to cut query latency to <300 ms, enabling efficient API monitoring and fast issue resolution.
Netflix Muse: Scaling Analytics at Trillion-Row Scale
Netflix evolved its Muse architecture to handle huge datasets efficiently: HyperLogLog sketches, Hollow in-memory feeds, and Druid optimizations cut query latency by ~50% and reduced concurrency load.
⚡ Latency Every Data Streaming Engineer Should Know

“Real-time” has limits—disk, network, and replication delays add up. StreamNative explains latency tiers, common costs, and tuning levers like batching & async processing.
💡 Must-read for data streaming engineers!
MCP (Model Context Protocol) promises a new way for LLMs to use tools.

Chris Riccomini argues it mostly reinvents OpenAPI, gRPC & CLIs.
Resources = docs
Tools = RPC
Prompts = configs

So… could MCP have just been a JSON file?

💡 More insights: www.dataengineeringw...
How Tables Got Smarter: Iceberg → DuckLake. From static snapshots to stream-native updates and catalog-first metadata, tables are evolving fast. Choose by intent, not hype.

Subscribe → www.dataengineeringw...

Full story → medium.com/fresha-da...
How Tables Grew a Brain: Iceberg → DuckLake
Snapshots → incremental → stream-native → catalog-first.
Metadata is the bottleneck.

More insights → www.dataengineeringw...

Full story → medium.com/fresha-da...
BlaBlaCar scales like a pro!

dbt Core → Transform like a champ

Airflow → Orchestrate effortlessly

CI/CD → Deploy instantly

Dev Containers → Standardized dev

📖 Full story → medium.com/blablacar...

💡 More insights → Subscribe to DEW

#DataEngineering #dbt #Airflow #CICD #DevContainers
🚀 AI adoption is booming—but most data isn’t ready!

AI-ready data is:

Unified

Real-time

Human-verified

Governed

Without it, AI can confidently fail. With it? Reliable, scalable results.

📖 Read More

💡 More insights → Data Engineering Weekly
#AI #AIReady #DataEngineering
Stripe’s Real-Time Billing Analytics ⚡
Stripe wanted real-time visibility into subscriptions.
Traditional batch systems weren’t fast enough. ⏱️
They built a pipeline using Flink, Spark, and Pinot v2.
Now, analytics arrive in minutes, not hours. Queries return in <300ms. 🚀
The 238th edition of Data Engineering Weekly is available, featuring exciting Data & AI articles.

Read more:
www.dataengineeringw...