Just like snooping-based protocols, directory-based cache coherence protocols have to handle non-atomic requests in real-world implementations. We start from a directory-based MSI base model, and then discuss one solution for handling non-atomic requests.
Base Model
In the directory-based MSI protocol, there are three message types:
- Request messages, including GetS, GetM, PutM and PutS
- Forwarded request messages, including Fwd-GetS, Fwd-GetM, Inv (Invalidation), and Put-Ack
- Response messages, including Data and Inv-Ack (Invalidation Acknowledgement)
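As a rough illustration, the three message classes can be modelled as an enumeration with a lookup from message name to class. The names `MsgClass`, `MSG_CLASS` and `class_of` are hypothetical and exist only for this sketch; the grouping itself (including Put-Ack among the forwarded requests) follows the list above.

```python
from enum import Enum


class MsgClass(Enum):
    REQUEST = "request"      # cache controller -> directory
    FORWARDED = "forwarded"  # directory -> cache controller
    RESPONSE = "response"    # data and acknowledgements


# Message name -> message class, exactly as listed above.
MSG_CLASS = {
    "GetS": MsgClass.REQUEST,
    "GetM": MsgClass.REQUEST,
    "PutM": MsgClass.REQUEST,
    "PutS": MsgClass.REQUEST,
    "Fwd-GetS": MsgClass.FORWARDED,
    "Fwd-GetM": MsgClass.FORWARDED,
    "Inv": MsgClass.FORWARDED,
    "Put-Ack": MsgClass.FORWARDED,
    "Data": MsgClass.RESPONSE,
    "Inv-Ack": MsgClass.RESPONSE,
}


def class_of(msg: str) -> MsgClass:
    """Return the message class of a protocol message (hypothetical helper)."""
    return MSG_CLASS[msg]
```

Keeping the class explicit makes it easy to see which messages can overtake each other: only messages of different classes travel independently.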
We make the following assumptions:
- All caches implement write-back + write-allocate policy
- Separate networks for each message type for deadlock prevention
- The interconnection network / fabric enforces point-to-point ordering for forwarded request messages, i.e., if a directory sends two forwarded request messages to a cache controller, the messages arrive at that cache controller in order
- A complete directory accurately tracks the status and sharers of each cache line
- A cache controller will not receive forwarded request messages for cache lines that have outstanding coherence transactions; the directory controller will not receive request messages for cache lines that have outstanding data responses
The cache controller state transitions for the directory-based MSI protocol are shown below:
| State | Processor Core Events | | | Forwarded Request Messages | | | | Response Messages | | |
| | Load | Store | Eviction | Fwd-GetS | Fwd-GetM | Inv | Put-Ack | Data from Dir | Data from Owner | Inv-Ack |
| I | Issue GetS | Issue GetM | – | – | – | – | – | Data[ack=0] / S (GetS) or M (GetM); Data[ack>0] / – | Data[ack=0] / S (GetS) or M (GetM) | if (last Inv-Ack) – / M else ack-- |
| S | Load hit | Issue GetM | Issue PutS | – | – | Send Inv-Ack to Req / I | – / I | Data[ack=0] / M; Data[ack>0] / – | – | if (last Inv-Ack) – / M else ack-- |
| M | Load hit | Store hit | Issue PutM, send data to Dir | Send Data[ack=0] to Req & Dir / S | Send Data[ack=0] to Req / I | – | – / I | – | – | – |
When a cache controller issues a “GetM” for a line in State I, the data response can come from either the directory or the current owner cache. If the data response comes from the directory, the requesting cache may have to wait for Inv-Ack messages from other caches (the ack count carried in the Data message tells it how many), and it cannot modify the cache line until it has received all of them. If the data response comes from the owner cache, the requesting cache does not expect any Inv-Ack messages.
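The acknowledgement counting described here can be sketched as follows. This is a minimal illustration, not the author's implementation: the class `PendingGetM` and its method names are hypothetical, and the counter is allowed to go negative so that an Inv-Ack overtaking the Data message is still counted correctly.

```python
class PendingGetM:
    """Tracks when a GetM issued from State I may complete (hypothetical helper)."""

    def __init__(self):
        self.have_data = False
        self.acks_needed = 0  # goes negative if Inv-Acks arrive before Data

    def on_data(self, ack_count: int) -> None:
        # Data from the directory carries the number of Inv-Acks to expect;
        # Data from an owner cache always carries ack_count == 0.
        self.have_data = True
        self.acks_needed += ack_count

    def on_inv_ack(self) -> None:
        self.acks_needed -= 1

    def can_write(self) -> bool:
        # The line may be modified only with data and all Inv-Acks in hand.
        return self.have_data and self.acks_needed == 0
```

For example, after `on_data(2)` the line stays unwritable until `on_inv_ack()` has been called twice, whereas `on_data(0)` makes it writable immediately.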
When a cache line in State S gets evicted, it cannot be evicted silently. Instead, the cache controller must issue a “PutS” request to the directory, letting the directory update its sharer list.
In both cases (eviction from State S or from State M), the cache controller issues a “PutS” or “PutM” request to the directory, and the directory responds with a “Put-Ack” message.
Denoting directory states from the caches’ perspective (for example, directory State M means that some cache holds the line in State M), the directory state transitions for the protocol are shown below:
| State | Request Messages | | | | Response Messages |
| | GetS | GetM | PutS | PutM + Data from Owner | Data |
| I | Send data to Req, add Req to sharer / S | Send data to Req, set Owner as Req / M | – | – | – |
| S | Send data to Req, add Req to sharer / S | Send data to req, send Inv to sharers, clear sharers, set Owner to Req / M | Remove Req from sharers, send Put-Ack to Req / S (not the last PutS) or I (the last PutS) | – | – |
| M | Send Fwd-GetS to Owner, add Req and Owner to sharer, clear Owner | Send Fwd-GetM to Owner, set Owner to Req | – | Copy data to memory, clear Owner, send Put-Ack to Req / I | Copy data to memory / S |
When the directory gets a “GetS” request for a line in State M, it sends the “Fwd-GetS” request to the owner cache, and expects a data write back. Once it gets the data response from the previous owner cache, it copies the data to memory and transitions the directory state to State S.
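The base-model directory table can be read as a small state machine. The sketch below is one way to write it down under stated assumptions: `DirectoryEntry`, its method names, and the `sent` log are hypothetical, the memory copy is omitted, and Inv is sent to every sharer except the requester (the table only says "sharers").

```python
class DirectoryEntry:
    """Directory state for one cache line in the base (atomic) model."""

    def __init__(self):
        self.state, self.owner, self.sharers = "I", None, set()
        self.sent = []  # (message, destination) log, for illustration only

    def on_get_s(self, req):
        if self.state == "M":
            self.sent.append(("Fwd-GetS", self.owner))
            self.sharers |= {req, self.owner}
            self.owner = None  # state becomes S once the owner's Data arrives
        else:  # I or S
            self.sent.append(("Data", req))
            self.sharers.add(req)
            self.state = "S"

    def on_get_m(self, req):
        if self.state == "M":
            self.sent.append(("Fwd-GetM", self.owner))
        else:  # I or S
            self.sent.append(("Data", req))
            for sharer in sorted(self.sharers - {req}):
                self.sent.append(("Inv", sharer))
            self.sharers.clear()
        self.owner, self.state = req, "M"

    def on_put_s(self, req):  # only expected in State S in the base model
        self.sharers.discard(req)
        self.sent.append(("Put-Ack", req))
        self.state = "S" if self.sharers else "I"

    def on_put_m(self, req):  # PutM + data from the owner, State M
        self.owner = None
        self.sent.append(("Put-Ack", req))
        self.state = "I"

    def on_data(self):  # write-back from the previous owner after Fwd-GetS
        self.state = "S"
```

Driving it with a GetS in State M shows the behaviour described above: the entry stays in State M after sending Fwd-GetS and only moves to State S when `on_data()` is called.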
Handling Non-Atomic Requests (By Stalls)
In a real directory-based MSI protocol, a cache controller may receive forwarded request messages for a cache line that has a coherence transaction outstanding. In addition, the directory controller may receive request messages for a cache line that has a data response outstanding.
Handling such cases requires adding transient states to the directory-based MSI protocol. To keep the handling of transient states simple, we stall the “conflicting” forwarded request messages at the cache controllers and the “conflicting” request messages at the directory.
The assumptions are the same as in the base model, except that the last one (no conflicting messages) is dropped:
- All caches implement write-back + write-allocate policy
- Separate networks for each message type for deadlock prevention
- The interconnection network / fabric enforces point-to-point ordering for forwarded request messages, i.e., if a directory sends two forwarded request messages to a cache controller, the messages arrive at that cache controller in order
- A complete directory accurately tracks the status and sharers of each cache line
The cache controller state transitions for the directory-based MSI protocol are shown below:
| State | Processor Core Events | | | Forwarded Request Messages | | | | Response Messages | | |
| | Load | Store | Eviction | Fwd-GetS | Fwd-GetM | Inv | Put-Ack | Data from Dir | Data from Owner | Inv-Ack |
| I | Issue GetS / IS-D | Issue GetM / IM-AD | – | – | – | – | – | – | – | – |
| IS-D | Stall | Stall | Stall | – | – | Stall | – | Data[ack=0] / S | Data[ack=0] / S | – |
| IM-AD | Stall | Stall | Stall | Stall | Stall | – | – | Data[ack=0] / M; Data[ack>0] / IM-A | Data[ack=0] / M | ack-- |
| IM-A | Stall | Stall | Stall | Stall | Stall | – | – | – | – | if (last Inv-Ack) – / M else ack-- |
| S | Load hit | Issue GetM / SM-AD | Issue PutS / SI-A | – | – | Send Inv-Ack to Req / I | – | – | – | – |
| SM-AD | Load hit | Stall | Stall | Stall | Stall | Send Inv-Ack to Req / IM-AD | – | Data[ack=0] / M; Data[ack>0] / SM-A | – | ack-- |
| SM-A | Load hit | Stall | Stall | Stall | Stall | – | – | – | – | if (last Inv-Ack) – / M else ack-- |
| M | Load hit | Store hit | Issue PutM, send data to Dir / MI-A | Send Data[ack=0] to Req & Dir / S | Send Data[ack=0] to Req / I | – | – | – | – | – |
| MI-A | Stall | Stall | Stall | Send Data[ack=0] to Req & Dir / SI-A | Send Data[ack=0] to Req / II-A | – | – / I | – | – | – |
| SI-A | Stall | Stall | Stall | – | – | Send Inv-Ack to Req / II-A | – / I | – | – | – |
| II-A | Stall | Stall | Stall | – | – | – | – / I | – | – | – |
In the table above, transient states are written in the form “XY-AD”: the line is moving from State X to State Y, “A” means that acknowledgements are still outstanding, and “D” means that the data response is still outstanding.
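The naming convention can be made concrete with a tiny parser. This is only a sketch for the cache controller states in the table above; the function name `parse_state` and the tuple layout are hypothetical.

```python
def parse_state(name):
    """Split a cache state name into (from_state, to_state, waits_ack, waits_data).

    Stable states such as "M" are reported as moving from M to M with
    nothing outstanding.
    """
    if "-" not in name:
        return name, name, False, False
    xy, pending = name.split("-")
    return xy[0], xy[1], "A" in pending, "D" in pending
```

For instance, `parse_state("IM-AD")` yields `("I", "M", True, True)`: the line is on its way from State I to State M and is still waiting for both the data response and acknowledgements.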
When a cache controller processes a load miss, the cache line state first transitions to “IS-D”. In the meantime, the cache controller may receive an Inv message for the same line (because data responses and Inv messages travel on separate networks). The cache controller simply stalls the Inv message until the data response returns and the cache line state eventually transitions to State S.
Similarly, when a cache controller processes a store miss, the cache line state first transitions to either “IM-AD” or “SM-AD” to wait for the data response, then possibly transitions to either “IM-A” or “SM-A” to wait for any outstanding Inv-Ack messages. In the meantime, the cache controller may receive “Fwd-GetS” or “Fwd-GetM” requests for the same cache line. The cache controller simply stalls these forwarded requests until the cache line eventually reaches the stable State M.
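The stall-and-replay behaviour for a store miss from State I might look like the sketch below. It is illustrative only: the class `StoreMissLine`, its method names, and the `log` list are hypothetical, and it assumes that the pending store performs as soon as the line reaches State M, before any stalled forwarded request is replayed.

```python
from collections import deque


class StoreMissLine:
    """One cache line after a store miss in State I (GetM already issued)."""

    def __init__(self):
        self.state = "IM-AD"
        self.acks = 0           # Inv-Acks still expected; may dip below zero
        self.stalled = deque()  # forwarded requests waiting for State M
        self.log = []

    def on_data(self, ack_count):
        self.acks += ack_count
        if self.acks == 0:
            self._enter_m()
        else:
            self.state = "IM-A"

    def on_inv_ack(self):
        self.acks -= 1
        if self.state == "IM-A" and self.acks == 0:
            self._enter_m()

    def on_forwarded(self, msg, req):
        """Handle Fwd-GetS or Fwd-GetM from the directory."""
        if self.state in ("IM-AD", "IM-A"):
            self.stalled.append((msg, req))  # stall until State M
            return
        assert self.state == "M", "this sketch only models the owner"
        if msg == "Fwd-GetS":
            self.log.append(f"Data to {req} and Dir")
            self.state = "S"
        else:  # Fwd-GetM
            self.log.append(f"Data to {req}")
            self.state = "I"

    def _enter_m(self):
        self.state = "M"
        self.log.append("store performs")  # assumption: store completes first
        while self.stalled and self.state == "M":
            self.on_forwarded(*self.stalled.popleft())
```

A Fwd-GetS that arrives in “IM-AD” is queued, and it is answered only after the Data message and the last Inv-Ack have moved the line to State M.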
In particular, when a cache line is in State “SM-AD” (logically the same as State S), it must respond to Inv messages. The cache controller transitions the cache line to State “IM-AD”, to prevent further load hits.
When a cache line in State S gets evicted, it cannot be evicted silently. Instead, the cache controller must issue a “PutS” request to the directory, and transition the cache line state to “SI-A” first. Before the cache controller receives the “Put-Ack” from the directory, the cache line state is logically the same as State S. If the cache controller receives an Inv message for the line, it needs to respond and transition the cache line state to “II-A”. State “II-A” is logically the same as State I, but it denotes that the cache controller is still waiting for a “Put-Ack” to complete the eviction.
Similarly, when a cache line in State M gets evicted, the cache controller issues a “PutM” request to the directory, and transitions the cache line state to “MI-A” first. Before the cache controller receives the “Put-Ack” from the directory, the cache line state is logically the same as State M. If the cache controller receives a “Fwd-GetS” or “Fwd-GetM” request, it needs to respond and transition the cache line state to “SI-A” or “II-A”, respectively. These states are logically the same as State S and State I, but they denote that the cache controller is still waiting for a “Put-Ack” to complete the eviction.
Denoting directory states from the caches’ perspective, the directory state transitions for the protocol are shown below:
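The eviction paths can be written as a small transition table taken from the cache controller table above. The dictionary name `EVICT` and the helper `run` are hypothetical and only serve to replay a sequence of events.

```python
# (state, event) -> (action, next_state) for the eviction-related rows.
EVICT = {
    ("S", "Evict"): ("Issue PutS", "SI-A"),
    ("M", "Evict"): ("Issue PutM + data", "MI-A"),
    ("MI-A", "Fwd-GetS"): ("Send data to Req and Dir", "SI-A"),
    ("MI-A", "Fwd-GetM"): ("Send data to Req", "II-A"),
    ("SI-A", "Inv"): ("Send Inv-Ack to Req", "II-A"),
    ("MI-A", "Put-Ack"): (None, "I"),
    ("SI-A", "Put-Ack"): (None, "I"),
    ("II-A", "Put-Ack"): (None, "I"),
}


def run(state, events):
    """Replay events from a starting state; return (final_state, actions)."""
    actions = []
    for event in events:
        action, state = EVICT[(state, event)]
        if action is not None:
            actions.append(action)
    return state, actions
```

Replaying `["Evict", "Fwd-GetS", "Inv", "Put-Ack"]` from State M walks through “MI-A”, “SI-A” and “II-A” before finally reaching State I.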
| State | Request Messages | | | | | Response Messages |
| | GetS | GetM | PutS | PutM + Data from Owner | PutM + Data from Non-Owner | Data |
| I | Send data to Req, add Req to sharer / S | Send data to Req, set Owner as Req / M | Send Put-Ack to Req | – | Send Put-Ack to Req | – |
| S | Send data to Req, add Req to sharer / S | Send data to req, send Inv to sharers, clear sharers, set Owner to Req / M | Remove Req from sharers, send Put-Ack to Req / S (not the last PutS) or I (the last PutS) | – | Remove Req from sharers, send Put-Ack to Req | – |
| M | Send Fwd-GetS to Owner, add Req and Owner to sharer, clear Owner / S-D | Send Fwd-GetM to Owner, set Owner to Req | Send Put-Ack to Req | Copy data to memory, clear Owner, send Put-Ack to Req / I | Send Put-Ack to Req | – |
| S-D | Stall | Stall | Remove Req from sharers, send Put-Ack | – | Remove Req from sharers, send Put-Ack | Copy data to memory / S |
At first glance, it might be confusing to see “PutM + Data from Non-Owner” in directory State S. Say Core 0 evicts a line in State M and issues a “PutM” request to the directory. Before the “PutM” request from Core 0 reaches the directory, the directory may have already served “GetS” from Core 1 by sending “Fwd-GetS” to Core 0, and the directory may have already changed to State S. This is possible since all links are point-to-point, and there are no ordering requirements among different message classes.
Similarly, it is possible to see “PutS” and “PutM + Data from Non-Owner” in directory State M. Say Core 0 evicts a line in State S or M, and issues a “PutS” or “PutM” request to the directory. Before the “PutS” / “PutM” from Core 0 reaches the directory, the directory may have already served a “GetM” from Core 1, by sending Inv (if Core 0 held the line in State S) or “Fwd-GetM” (if Core 0 held it in State M) to Core 0, and may have already changed to State M with Core 1 as the owner.
Taking this one step further, say Core 0 evicts a line in State S or M, and issues a “PutS” or “PutM” request to the directory. Before the “PutS” / “PutM” from Core 0 reaches the directory, the directory may have already served other requests that return it to State I, for example a “GetM” from Core 1 (served by sending Inv or “Fwd-GetM” to Core 0) followed by a “PutM” from Core 1. As a result, it is possible to see “PutS” and “PutM + Data from Non-Owner” in directory State I as well.
Again, all of the scenarios discussed above can only happen when a cache controller is allowed to receive a forwarded request message for a cache line that also has a coherence transaction outstanding.
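The way the directory absorbs these late Put messages can be sketched as a single function over the stable directory states. This is an illustration of the table rows only: the function name `handle_put` and its argument layout are hypothetical, a Put-Ack is assumed to be sent in every case, and the memory copy for an owner's PutM is omitted.

```python
def handle_put(state, owner, sharers, kind, req):
    """Apply a PutS or PutM from cache `req`; return (state, owner, sharers).

    A Put-Ack goes back to `req` in every case, so it is not modelled here.
    """
    sharers = set(sharers) - {req}  # no-op if req is not a sharer
    if kind == "PutM" and state == "M" and req == owner:
        # PutM + data from the owner: the normal eviction from State M.
        state, owner = "I", None
    elif kind == "PutS" and state == "S" and not sharers:
        # The last PutS returns the directory to State I.
        state = "I"
    # Anything else is a late Put from a cache that has already lost the
    # line: only the sharer list changes, if at all.
    return state, owner, sharers
```

A late “PutM” from Core 0 that reaches the directory in State M with Core 1 as the owner therefore changes nothing except that Core 0 receives its “Put-Ack”.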