Queues all the way down
Every distributed system is just queues pretending to be something else.
The Realization
HTTP request? Queue with a timeout. Database write? Queue with durability. Cache read? Queue with amnesia.
The Problem
We had a "real-time" notification system. Users expected instant delivery. What we had was: API queue → processing queue → delivery queue → device queue. Four queues, each with its own failure mode, each with its own backpressure, each with its own definition of "done."
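A minimal sketch of how backpressure travels through a chain like that, assuming each stage is a bounded in-memory queue. The stage names come from the list above; `build_pipeline`, `accept`, `step`, and the tiny capacity are hypothetical, invented for illustration, and not the real system.

```python
import queue

# Stage order taken from the post; everything else here is illustrative.
STAGES = ["api", "processing", "delivery", "device"]


def build_pipeline(capacity):
    """Create one bounded FIFO queue per stage."""
    return {name: queue.Queue(maxsize=capacity) for name in STAGES}


def accept(pipeline, message):
    """Try to admit a message at the front of the pipeline.

    Returns False when the API queue is full, which is what
    backpressure looks like from the caller's side.
    """
    try:
        pipeline["api"].put_nowait(message)
        return True
    except queue.Full:
        return False


def step(pipeline, device_is_draining):
    """Advance the pipeline by one tick (single-threaded toy model).

    The device consumes at most one message if it is draining, then each
    hop moves at most one message downstream, last hop first, so a
    message advances only one stage per tick.
    """
    if device_is_draining and not pipeline["device"].empty():
        pipeline["device"].get_nowait()
    hops = list(zip(STAGES, STAGES[1:]))
    for upstream, downstream in reversed(hops):
        src, dst = pipeline[upstream], pipeline[downstream]
        if not src.empty() and not dst.full():
            dst.put_nowait(src.get_nowait())
```

With a capacity of 2 per stage and `device_is_draining=False`, the pipeline admits eight messages and then `accept` starts returning False: a stall at the device surfaces as a rejection at the API, four queues later. Nothing in between reports an error.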
The Latency
P50 was 340 ms. P99 was 47 seconds. The difference? Queue depth. When things were calm, messages flew through. When things got busy, messages waited. And waited.
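To see how queue depth alone can pull P99 away from P50, here is a toy simulation of a single FIFO queue. It is a sketch under made-up assumptions: time moves in abstract ticks, the traffic shapes are hypothetical, and `simulate_waits` and `percentile` are illustrative helpers. It does not try to reproduce the numbers above.

```python
import math
from collections import deque


def simulate_waits(arrivals_per_tick, capacity_per_tick):
    """Return how many ticks each message waited in one FIFO queue.

    arrivals_per_tick[t] is the number of messages arriving at tick t;
    the consumer serves at most capacity_per_tick messages per tick.
    """
    waiting = deque()  # arrival tick of every message still queued
    waits = []
    tick = 0
    while tick < len(arrivals_per_tick) or waiting:
        if tick < len(arrivals_per_tick):
            waiting.extend([tick] * arrivals_per_tick[tick])
        for _ in range(min(capacity_per_tick, len(waiting))):
            waits.append(tick - waiting.popleft())
        tick += 1
    return waits


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Steady traffic with one burst of 100 messages in a single tick.
    bursty = simulate_waits([1] * 300 + [100] + [1] * 100, capacity_per_tick=2)
    print("p50:", percentile(bursty, 50), "p99:", percentile(bursty, 99))
```

The consumer here has twice the capacity it needs on average, so most messages never wait and the median stays at zero. The one burst builds depth, and every message that lands behind it pays for that depth, which is where the tail comes from.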
The Tradeoff
You can pick two of consistency, availability, and partition tolerance. You can pick two of fast, reliable, and cheap. Queues let you pretend you can have all three. You can't.
The Truth
Latency is just the queue depth you haven't measured yet.