How can caches be bad?
Caches are widely praised and come up in almost every system design discussion, yet there are situations where they do more harm than good.
In practice, a cache operates in one of two modes:
- First, the mode where the cache is full of the right data.
- Second, the mode where the cache is empty or full of the wrong data.
The first mode is the happy case, where the cache works as intended and absorbs most of the traffic. In the second mode, the cache absorbs almost nothing: requests miss and fall through to the backing services, which suddenly receive traffic they may not be sized for. This leads to services being slow or even going down.
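To make the difference between the two modes concrete, here is a small back-of-the-envelope calculation. The request rate and hit rate are made-up numbers chosen for illustration, and `backend_rps` is a hypothetical helper, not something from a real library.

```python
def backend_rps(total_rps: float, hit_rate: float) -> float:
    """Requests per second that miss the cache and reach the backend."""
    return total_rps * (1.0 - hit_rate)


total = 10_000.0                 # assumed overall request rate
warm = backend_rps(total, 0.95)  # mode one: assumed 95% hit rate
cold = backend_rps(total, 0.0)   # mode two: every request misses

print(f"warm cache: ~{warm:.0f} rps reach the backend")
print(f"cold cache: ~{cold:.0f} rps reach the backend")
print(f"amplification: ~{cold / warm:.0f}x")
```

Under these assumptions, a backend provisioned for roughly 500 requests per second is suddenly asked to serve about twenty times that when the cache goes cold.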
There are also cases where the cache transitions from mode one to mode two. When this happens, the system can settle into a state that is stable but down: the cache is unlikely to recover on its own, because the newly added traffic saturates the network and the right data can no longer be loaded into the cache. So how do we avoid this bimodal behavior, where the system is either fast or down?
For starters, we can avoid the sharp transition between fast and down by preventing sudden waves of cache misses and by controlling the load on the backend. Three techniques help here: staggering cache expirations, so that entries do not all expire at once; coalescing requests, so that concurrent misses for the same key result in a single backend call; and serving stale data when a refresh is slow or fails. Together they ensure that traffic increases gradually instead of spiking all at once. This keeps the system in a degraded but stable state, allowing the cache to recover instead of causing a full outage.
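The three techniques can be sketched in one small in-process cache. This is a minimal illustration, not a production implementation: `ResilientCache`, `loader`, and the `ttl`, `jitter`, and `clock` parameters are hypothetical names introduced here, and the sketch assumes a threaded Python service with a blocking backend call.

```python
import random
import threading
import time


class ResilientCache:
    """Toy cache: jittered TTLs, request coalescing, stale fallback."""

    def __init__(self, loader, ttl=60.0, jitter=0.2, clock=time.monotonic):
        self._loader = loader    # hypothetical backend call: key -> value
        self._ttl = ttl
        self._jitter = jitter    # expiry is spread over ttl * (1 +/- jitter)
        self._clock = clock
        self._entries = {}       # key -> (value, expires_at)
        self._key_locks = {}     # key -> lock used to coalesce loads
        self._guard = threading.Lock()

    def _expiry(self):
        # Staggered expiration: entries written together expire apart.
        spread = random.uniform(1 - self._jitter, 1 + self._jitter)
        return self._clock() + self._ttl * spread

    def _fresh(self, key):
        entry = self._entries.get(key)
        if entry is not None and self._clock() < entry[1]:
            return entry
        return None

    def get(self, key):
        entry = self._fresh(key)
        if entry is not None:
            return entry[0]

        with self._guard:
            lock = self._key_locks.setdefault(key, threading.Lock())

        # Coalescing: one caller per key reaches the backend at a time.
        with lock:
            entry = self._fresh(key)  # another caller may have refilled it
            if entry is not None:
                return entry[0]
            try:
                value = self._loader(key)
            except Exception:
                stale = self._entries.get(key)
                if stale is not None:
                    return stale[0]   # serve stale data rather than fail
                raise
            self._entries[key] = (value, self._expiry())
            return value
```

With a warm entry, `get` never touches the backend. When an entry expires, concurrent callers for the same key wait on one lock and only the first calls `loader`. If that call raises and an older value exists, the older value is returned. A real deployment would also need eviction and some limit on how old a stale value may be, both of which are left out here.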
These techniques do not help in every case, so where possible, prefer a design that does not need a cache at all. In some cases, it’s better to avoid fragile caching layers and instead use more predictable patterns, such as complete materialized views, local data copies, or backends that can handle load without relying heavily on caching. These approaches reduce the risk of sudden cache failure modes and make system behavior more stable.