Andras Gerlits
omniledger.io
@omniledger.io
I have built the first async, consistent data-platform
https://omniledger.io/

I build distributed systems @Citi

I also write about distributed systems
https://medium.com/@andrasgerlits
There are two options here: we either end up showing that we're actually wrong and all our theories are for the birds, or that cloud providers and their overpriced services are obsolete.
November 29, 2025 at 9:44 AM
In the next thread, I'll explain how to build such a clock from many smaller ones, how such a system can be built on any hardware, and how it can be much more resilient than its alternatives.
November 20, 2025 at 5:52 AM
So, what can we do? We can establish a different kind of clock, one which allows for different places to move at their own speeds. In physics, this is called "proper time". We also need a way to reach consensus about the outcomes of events, ordered by these clocks.
November 20, 2025 at 5:52 AM
When event A happens at a distance of 100 ms from event B, they cannot be reconciled in under 400 ms. Now, if event C depends on either A or B, it needs to wait 400 ms before it can even start processing. You can see how this quickly gets out of hand.
November 20, 2025 at 5:52 AM
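The way these waits pile up along a dependency chain can be sketched in a few lines of Python. This is only an illustration of the post above, not a model of any real system: the 400 ms figure is taken from the post, while the function name, the dependency map, and the assumption that processing itself takes no time are all hypothetical.

```python
RECONCILE_MS = 400  # reconciliation window quoted in the post above


def earliest_start_ms(event, depends_on, reconcile_ms=RECONCILE_MS):
    """Earliest time (ms) at which `event` can begin processing.

    Assumption: events with no parents start at 0, processing is free,
    and a dependent event waits one reconciliation window after its
    slowest parent has started.
    """
    parents = depends_on.get(event, [])
    if not parents:
        return 0
    slowest_parent = max(
        earliest_start_ms(p, depends_on, reconcile_ms) for p in parents
    )
    return slowest_parent + reconcile_ms


# Hypothetical chain: C depends on A and B, D on C, E on D.
deps = {"C": ["A", "B"], "D": ["C"], "E": ["D"]}

for name in ["A", "C", "D", "E"]:
    print(name, earliest_start_ms(name, deps))
```

Under these assumptions, every extra level of dependency adds another full window, so a chain of three dependent events is already more than a second behind its sources.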
When Spanner attempts to "flatten out" time by playing on our perception of time-order, what it's actually doing is trying to establish some absolute, shared time through cause and effect, using communication. These time-effects are transitive, so reverberate across the system.
November 20, 2025 at 5:52 AM
Since locks are also records, and they are in a causal relationship with other processes throughout the system, Spanner's clocks are made up of both its atomic clocks and the communication mechanism serving its locking, as those are all "things that establish order".
November 20, 2025 at 5:52 AM
People usually understand Spanner's "clocks" to mean its physical, atomic, "wall-clocks", but they only provide a subset of all the operations expected of a clock in a distributed system. A clock is supposed to order all changes in the system, but in Spanner, it doesn't.
November 20, 2025 at 5:52 AM
Spanner isn't fair. Local transactions can grab locks much more quickly than remote ones, so they get the opportunity to write records more often.

So how does this cause our scaling problems? By making all changes "local" at all places, papering over the fact that they aren't.
November 20, 2025 at 5:52 AM
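A rough back-of-the-envelope model of the fairness claim above, in Python. It is not a description of Spanner's internals: it simply assumes each writer repeats a lock, write, release cycle in sequence, and the round-trip and hold times are made-up illustrative numbers.

```python
def writes_per_second(lock_round_trip_ms, hold_ms=1.0):
    """Upper bound on sequential lock/write/release cycles per second.

    Assumption: one cycle costs one lock round trip plus the time the
    lock is held; nothing else is modelled.
    """
    return 1000.0 / (lock_round_trip_ms + hold_ms)


# Hypothetical figures: 1 ms round trip inside a site,
# 200 ms round trip (100 ms each way) to a remote site.
local = writes_per_second(lock_round_trip_ms=1.0)
remote = writes_per_second(lock_round_trip_ms=200.0)

print(f"local:  {local:.1f} writes/s")
print(f"remote: {remote:.1f} writes/s")
print(f"local writer gets about {local / remote:.0f}x more attempts")
```

Even in this crude model, the writer closest to the lock gets roughly two orders of magnitude more chances to write than the remote one.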
So what are the limits of Spanner vs the limits of distributed consistency? Even the simplest pessimistic transaction will take at least 2 round-trips to complete, so 400 ms instead of 100 ms.

Even worse: we're comparing apples to oranges. The 100ms was for "fair writes".
November 20, 2025 at 5:52 AM
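The arithmetic behind the 400 ms figure above, written out as a small Python sketch. It assumes the 100 ms in the post is the one-way delay between the two sites; the function names are hypothetical.

```python
def round_trip_ms(one_way_ms):
    """A round trip is a request plus its acknowledgement."""
    return 2 * one_way_ms


def pessimistic_latency_ms(one_way_ms, round_trips=2):
    """Minimum latency of a lock-based transaction needing `round_trips`."""
    return round_trips * round_trip_ms(one_way_ms)


ONE_WAY_MS = 100  # assumed one-way delay between the two sites

print(round_trip_ms(ONE_WAY_MS))           # one round trip
print(pessimistic_latency_ms(ONE_WAY_MS))  # two round trips
```

With these assumptions, one round trip costs 200 ms and the two round trips of the simplest pessimistic transaction cost 400 ms, four times the one-way delay.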
Locks obfuscate this process by simulating a world where clocks are frozen until some system-wide consensus is reached about the next version of a locked record, at the end of some collection of write-events.

This abstraction in clocks is what causes our scaling problems.
November 20, 2025 at 5:52 AM
According to our model, Spanner (or any other pessimistic system) creates a different event each time it requests a lock over a remote resource. The local process emitting the event blocks until the remote node creates a counter-event to acknowledge the lock.
November 20, 2025 at 5:52 AM
Think about what a Compare-and-Swap operation is and you'll find the definition of optimistic conflict-resolution. Our intent to update the value of a given record is conditional on that record currently having the given value. Locks are polite signals to others. Conventions.
November 20, 2025 at 5:52 AM
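A minimal Python sketch of the Compare-and-Swap idea from the post above: the update only goes through if the record still holds the value we last saw. The `Record` class and `optimistic_update` helper are hypothetical names for illustration; the internal guard merely stands in for the atomic instruction that hardware or a database would provide.

```python
import threading


class Record:
    """A single value that can only be changed via compare-and-swap."""

    def __init__(self, value):
        self._value = value
        # Stand-in for a hardware atomic: makes compare + swap one step.
        self._guard = threading.Lock()

    def read(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Set the value to `new` only if it currently equals `expected`."""
        with self._guard:
            if self._value != expected:
                return False  # someone else won the race
            self._value = new
            return True


def optimistic_update(record, fn, max_retries=10):
    """Read, compute, and retry until the conditional write succeeds."""
    for _ in range(max_retries):
        seen = record.read()
        if record.compare_and_swap(seen, fn(seen)):
            return True
    return False


balance = Record(10)
print(balance.compare_and_swap(10, 11))  # our view was current
print(balance.compare_and_swap(10, 12))  # our view was stale
print(optimistic_update(balance, lambda v: v + 1), balance.read())
```

The second call fails because its intent was conditional on a value the record no longer has, which is exactly the conflict an optimistic writer has to detect and retry.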
This leads us to the question of conflict resolution. If we want to arbitrate between such races, how can we do that without messing up our history, potentially affecting past events? That will be the next thread.
November 19, 2025 at 5:18 AM
We can potentially say that our local site has priority to decide the order, so something like a master-slave setup between these clocks. In this case, the "slave" clock can propose changes to the events, but can never be sure that these will be accepted.
November 19, 2025 at 5:18 AM
Since nodes can generate a lot of data in 100 ms, this means that the further away the other nodes are, the less often we can allow our readers to read new data.

There is one caveat: this is only categorically true if we want to give fair access to each writer.
November 19, 2025 at 5:18 AM