To understand cache coherence, we have to take one step back and look at the memory model first.
What is a Memory Model?
A memory consistency model, or memory model, dictates the order in which memory reads and writes (or loads and stores) become visible in a coherent shared-memory system. In computer-architecture terms, memory reads (or loads) update a processor’s internal register values, while memory writes (or stores) update data cache / memory values.
Unlike a single-threaded execution, a multi-threaded program usually allows multiple correct behaviors. Take the following program as an example:
| Core 0 | Core 1 | Comments |
| --- | --- | --- |
| S0: store x = 1; | S1: store y = 1; | Initially, x = 0 & y = 0 |
| L0: load r0 = y; | L1: load r1 = x; | |
Assuming both Core 0 and Core 1 perform no instruction reordering, and no loads or stores are reordered in the interconnection network, r0 and r1 still need not both be 1 at the end of the program. From a global memory order’s perspective, the following interleavings (each preserving per-core program order) are possible:
- S0 -> L0 -> S1 -> L1, resulting in r0 = 0, and r1 = 1
- S0 -> S1 -> L0 -> L1, resulting in r0 = 1, and r1 = 1
- S0 -> S1 -> L1 -> L0, resulting in r0 = 1, and r1 = 1
- S1 -> S0 -> L0 -> L1, resulting in r0 = 1, and r1 = 1
- S1 -> S0 -> L1 -> L0, resulting in r0 = 1, and r1 = 1
- S1 -> L1 -> S0 -> L0, resulting in r0 = 1, and r1 = 0
Therefore, there must be an architectural specification that defines shared memory correctness, i.e., the allowed behavior of multi-threaded programs executing in the shared memory system.
The memory consistency model, which is directly visible to programmers, serves as the architectural specification. It informs programmers how to safely coordinate and share data in a shared memory system.
Why Do We Need Cache Coherence?
Memory consistency models are enforced jointly by processor pipelines (instruction reordering) and cache coherence protocols. Cache coherence is a way to support memory consistency models.
From the programmer’s perspective, only the memory consistency model is visible. Cache coherence abstracts away the complexity of the multiple caches in the system, keeping them from being directly visible to programmers.
Cache coherence protocols seek to make the caches in a shared-memory system as functionally invisible as the cache in a single-core system. They do so by correctly propagating a processor’s write to other processors’ caches.
What Does Cache Coherence Enforce?
Generally speaking, cache coherence enforces two invariants:
- Single-writer-multiple-reader invariant: in each time epoch, for any given memory location, either a single core has read-write access or some number of cores (possibly 0) have read-only access
- Data value invariant: the value of a memory location at the start of an epoch is the same as the value of that memory location at the end of its last read-write epoch
In cache coherence protocol implementations, the single-writer invariant is commonly enforced by invalidation: when a core wishes to write a cache line, it initiates a coherence request to invalidate the copies in all other caches. Once the copies are invalidated, the requesting core can write to the line without the possibility of another core reading the stale value. If some other core wants to read the line after its copy has been invalidated, it has to initiate a new coherence request to obtain read permission, and it will obtain a copy from the core that wrote the line, thus enforcing coherence.
Note that the two invariants above jointly enforce synchronous write propagation across the entire system. There also exist cache coherence protocols that support asynchronous write propagation, i.e., a write may be visible to some cores while other cores still observe stale values. However, these are neither common CPU interview topics nor common CPU implementation practice.