@stanislavkozlovski.bsky.social
Data retention on disk/in memory is decoupled from whether a message was consumed. Unlike certain message queues, this means availability is unaffected by how consumers behave: a slow consumer can't make the broker run out of disk or memory
November 18, 2025 at 3:38 PM
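To make that concrete, here is a minimal sketch of creating a topic whose retention is purely time/size based, using the Java AdminClient. The broker address, topic name, and limits are made-up values:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class RetentionExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Retention is time/size based; whether anyone consumed is irrelevant.
            NewTopic topic = new NewTopic("orders", 3, (short) 3)
                .configs(Map.of(
                    "retention.ms", "604800000",    // keep 7 days, consumed or not
                    "retention.bytes", "1073741824" // or cap each partition at 1 GiB
                ));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```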
A leader/follower replication model does not require strong distributed consensus across many nodes in order to serve traffic.
November 18, 2025 at 3:38 PM
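You can see the single leader per partition from the Java AdminClient: produce and fetch traffic goes to that leader alone, while followers replicate in the background. A small sketch, with the address and topic name as placeholders:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class LeaderLookup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            TopicDescription desc = admin.describeTopics(List.of("orders"))
                .allTopicNames().get().get("orders");
            // Each partition has exactly one leader; clients talk only to it.
            desc.partitions().forEach(p ->
                System.out.printf("partition %d: leader=%s, replicas=%s%n",
                    p.partition(), p.leader(), p.replicas()));
        }
    }
}
```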
Partitions act as a sharding mechanism. A single partition usually does not push more than a few MB/s.
November 18, 2025 at 3:38 PM
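For example, with the Java producer, records with the same key hash to the same partition, which is what makes a partition behave like a shard. Broker address and names here are illustrative:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedProduce {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Same key -> same partition (a hash of the key), so one "shard"
            // owns all events for a given user, in order.
            producer.send(new ProducerRecord<>("orders", "user-42", "order-created"));
        }
    }
}
```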
A Kafka cluster scales by horizontally adding more nodes (called brokers).
November 18, 2025 at 3:38 PM
The append-only nature of the log (i.e. no updates/deletes) means that reads do not require a quorum to serve fresh data.

There are also no locks between concurrent reads or writes, hence no contention.
November 18, 2025 at 3:38 PM
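A toy, single-threaded sketch of why this holds (this is a model, not Kafka's code): once written, a record at a given offset never changes, so serving a read is just a positional scan with no lock and no quorum round-trip:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one partition's log. Records are immutable once appended,
// so a reader at offset N always sees the same bytes.
class ToyLog {
    private final List<byte[]> entries = new ArrayList<>();

    int append(byte[] record) {          // Produce: always appends at the end
        entries.add(record);
        return entries.size() - 1;       // the new record's offset
    }

    List<byte[]> readFrom(int offset) {  // Consume: sequential scan from an offset
        return entries.subList(offset, entries.size());
    }
}
```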
Kafka heavily leverages the OS file system cache (the page cache) to serve reads quickly from memory
November 18, 2025 at 3:38 PM
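Relatedly, on plaintext listeners Kafka serves consumer fetches via the sendfile syscall, which in Java surfaces as FileChannel.transferTo: the kernel ships file bytes (for recent data, already sitting in the page cache) straight to the socket with no user-space copy. A rough sketch with made-up file and peer names:

```java
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyRead {
    public static void main(String[] args) throws Exception {
        try (FileChannel log = FileChannel.open(Path.of("00000000.log"),
                 StandardOpenOption.READ);
             SocketChannel peer = SocketChannel.open(
                 new InetSocketAddress("localhost", 9000))) {
            // Kernel-to-kernel copy: page cache -> socket buffer.
            // (A real server would loop until all bytes are transferred.)
            log.transferTo(0, log.size(), peer);
        }
    }
}
```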
Physical disk writes are heavily batched on the server side. Because Kafka writes linear blocks to disk and does not fsync on every write, the OS can batch the data up in memory (the page cache) and coalesce it into larger IO operations
November 18, 2025 at 3:38 PM
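In Java terms, the write path looks roughly like this sketch (file name is illustrative): write() returns once the bytes are in the page cache, and it is the absence of a per-write force(true), i.e. fsync, that lets the kernel coalesce flushes:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PageCacheWrite {
    public static void main(String[] args) throws Exception {
        try (FileChannel log = FileChannel.open(Path.of("00000000.log"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            for (int i = 0; i < 10_000; i++) {
                // Returns once the bytes are in the OS page cache; the kernel
                // flushes them to disk later in large, coalesced IOs.
                log.write(ByteBuffer.wrap(("record-" + i + "\n").getBytes()));
            }
            // Kafka-style: do NOT call log.force(true) per write.
            // force(true) is fsync and would block on the physical disk.
        }
    }
}
```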
Producers can also be configured to not block on this replication consensus when writing (acks=1 vs acks=all).
November 18, 2025 at 3:38 PM
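A minimal sketch of that knob on the Java producer (broker address is a placeholder):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class AcksConfig {
    static Properties producerProps(boolean waitForAllReplicas) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // acks=all: the send completes only once all in-sync replicas have the
        // record. acks=1: it completes once the leader has it (lower latency,
        // weaker durability). Neither waits for an fsync to physical disk.
        props.put(ProducerConfig.ACKS_CONFIG, waitForAllReplicas ? "all" : "1");
        return props;
    }
}
```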
Writes by default require consensus that every in-sync replica has acknowledged the write before returning, but Kafka runs with fsync off by default and therefore writes to disk asynchronously, without blocking on the physical disk write. This provides near memory-like performance when well-tuned, since the synchronous write path is literally writing to memory.
November 18, 2025 at 3:38 PM
The broker does not touch the data; it's a dumb pipe. Messages aren't even validated against a schema, hence compressed data isn't decompressed on the server side
November 18, 2025 at 3:38 PM
Compression is usually also done on the client, and these larger batches compress much more favorably. This leads to larger network packets, larger sequential disk operations, and contiguous memory blocks
November 18, 2025 at 3:38 PM
Writes are batched on the client-side. Producers batch multiple messages into one "record-batch", hence increasing throughput per request.
November 18, 2025 at 3:38 PM
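A sketch of the producer settings behind this post and the compression one above; the sizes here are arbitrary examples, not recommendations:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class BatchingConfig {
    static Properties batchedProducerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Accumulate up to 64 KiB per partition, waiting up to 10 ms, so many
        // records ship as one record-batch in a single request.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
        // Compress on the client: the whole batch compresses better than
        // individual records would, and the broker stores it as-is.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
        return props;
    }
}
```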
The system supports two main operations, both very simple and performant:

- Produce, which appends to the end of the log
- Consume, which reads sequentially starting from any particular offset
November 18, 2025 at 3:38 PM
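Roughly what those two operations look like with the stock Java clients; the broker address, topic, and partition are placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProduceConsume {
    public static void main(String[] args) {
        Properties p = new Properties();
        p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            // Produce: appends to the end of the log
            producer.send(new ProducerRecord<>("orders", "k", "v"));
        }

        Properties c = new Properties();
        c.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        c.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        c.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            TopicPartition tp = new TopicPartition("orders", 0);
            consumer.assign(List.of(tp));
            consumer.seek(tp, 0L); // Consume: read sequentially from any offset
            consumer.poll(Duration.ofSeconds(1))
                    .forEach(r -> System.out.println(r.offset() + ": " + r.value()));
        }
    }
}
```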
The Android feels choppy/slow and the touch also doesn't feel super responsive
November 11, 2025 at 8:57 PM
Daylight is amazing. I'm really looking forward to a v2 (whenever it comes) that's a bit more refined
November 11, 2025 at 6:51 PM
I mean, yeah - it's not a technical piece, it's a financial article.

When I hear "the Kafka community", I think of engineering talk more than anything else
November 11, 2025 at 6:49 PM
With greater competition and commercialization, it seems bound to happen.
I'm of the opinion we're bound to see consolidation in the space soon, because there's too many companies chasing too little of a market: bigdata.2minutestreaming.com/p/event-stre...
November 10, 2025 at 3:34 PM
Then I don't see why a relational database can't maintain a pending queue too. Just process the lowest-id job
November 8, 2025 at 10:45 PM
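For illustration, a minimal JDBC/Postgres sketch of that pattern; the jobs table, connection string, and credentials are made up. FOR UPDATE SKIP LOCKED lets concurrent workers each claim a different lowest-id row without blocking one another:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class DbQueueWorker {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/app", "app", "secret")) {
            conn.setAutoCommit(false);
            long id;
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(
                     "SELECT id, payload FROM jobs ORDER BY id " +
                     "FOR UPDATE SKIP LOCKED LIMIT 1")) {
                if (!rs.next()) { conn.rollback(); return; } // queue is empty
                id = rs.getLong("id");
                // ... do the work with rs.getString("payload") here ...
            }
            try (PreparedStatement del = conn.prepareStatement(
                     "DELETE FROM jobs WHERE id = ?")) {
                del.setLong(1, id);
                del.executeUpdate();
            }
            conn.commit(); // releases the row lock; other workers skipped this row
        }
    }
}
```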
What are you referring to?

My intuition is rather that discussion has somewhat died down, in general.
November 8, 2025 at 10:44 PM
re: fairness - I admit I'm not as familiar with that. Can you teach me how some of these systems ensure fairness? Do they refuse to give messages to performant workers and hold them back for slower ones?
November 8, 2025 at 2:21 PM
There is some fundamental misunderstanding here.

pgmq is not built on top of SQS. It only provides API parity with it. No messages actually go to SQS...

Latency-wise, my tests showed single-digit millisecond writes and reads. 99% of use cases don't need less.
November 8, 2025 at 2:21 PM
I really don't know. I haven't evaluated it yet.
November 8, 2025 at 2:18 PM