Handling non-atomic requests in directory based MSI protocol (I)

Just like snooping based protocols, directory based cache coherence protocols have to handle non-atomic requests in real-world implementations. We start from a directory based MSI base model, and then discuss one solution for handling non-atomic requests.

Base Model

In directory based MSI protocol, there are 3 message types:

  1. Request messages, including GetS, GetM, PutM and PutS
  2. Forwarded request messages, including Fwd-GetS, Fwd-GetM, Inv (Invalidation), and Put-Ack
  3. Response messages, including Data and Inv-Ack (Invalidation Acknowledgement)

Given the following assumptions:

  1. All caches implement write-back + write-allocate policy
  2. Separate networks for each message type for deadlock prevention
  3. The interconnection network / fabric enforces point-to-point ordering for forwarded request messages, i.e., if a directory sends two forwarded request messages to a cache controller, the messages arrive at that cache controller in order
  4. A complete directory accurately tracks the status and sharers of each cache line
  5. A cache controller will not receive forwarded request messages for cache lines that have outstanding coherence transactions; the directory controller will not receive request messages for cache lines that have outstanding data responses

The cache controller state transitions for the directory based MSI protocol are shown below:

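Columns are grouped as processor core events (Load, Store, Eviction), forwarded request messages (Fwd-GetS, Fwd-GetM, Inv, Put-Ack), and response messages (Data from Dir, Data from Owner, Inv-Ack).

| State | Load | Store | Eviction | Fwd-GetS | Fwd-GetM | Inv | Put-Ack | Data from Dir | Data from Owner | Inv-Ack |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| I | Issue GetS | Issue GetM | | | | | | Data[ack=0] / S (GetS) or M (GetM); Data[ack>0] / – | Data[ack=0] / S (GetS) or M (GetM) | if (last Inv-Ack) – / M; else ack-- |
| S | Load hit | Issue GetM | Issue PutS | | | Send Inv-Ack to Req / I | – / I | Data[ack=0] / M; Data[ack>0] / – | | if (last Inv-Ack) – / M; else ack-- |
| M | Load hit | Store hit | Issue PutM, send data to Dir | Send Data[ack=0] to Req & Dir / S | Send Data[ack=0] to Req / I | | – / I | | | |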

When a cache controller issues “GetM” for a line in State I, the data response can come from either the directory or the current owner cache. If the data response comes from the directory, the requesting cache may have to wait for multiple Inv-Ack messages from other caches, and it cannot modify the cache line until it has received all of them. If the data response comes from the owner cache, the requesting cache does not expect any Inv-Ack messages.
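
The Inv-Ack counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `PendingGetM` and its method names are hypothetical, and the sketch lets the counter go negative so that an Inv-Ack arriving before the data response is still counted correctly.

```python
class PendingGetM:
    """Requester-side bookkeeping for one GetM issued from State I (hypothetical)."""

    def __init__(self):
        self.state = "I"
        self.have_data = False
        self.acks_outstanding = 0  # may dip below zero if an Inv-Ack beats the data

    def on_data(self, ack_count, from_owner=False):
        # Data from the owner never carries a non-zero ack count.
        if from_owner:
            assert ack_count == 0
        self.have_data = True
        self.acks_outstanding += ack_count
        self._maybe_finish()

    def on_inv_ack(self):
        self.acks_outstanding -= 1
        self._maybe_finish()

    def _maybe_finish(self):
        # The line becomes writable only with data in hand and no acks owed.
        if self.have_data and self.acks_outstanding == 0:
            self.state = "M"


req = PendingGetM()
req.on_data(ack_count=2)   # data from the directory, two sharers to invalidate
req.on_inv_ack()
print(req.state)           # still "I": one Inv-Ack is missing
req.on_inv_ack()
print(req.state)           # "M"
```

With data from the owner (`ack_count=0`), the same object moves to M immediately.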

When a cache line in State S gets evicted, it cannot be evicted silently. Instead, the cache controller must issue a “PutS” request to the directory, letting the directory update its sharer list.

When a cache line in State S or M gets evicted, the cache controller issues “PutS” or “PutM” requests to the directory, and the directory will respond with “Put-Ack” messages.

Denoting directory states from the caches’ perspective, the directory state transitions for the protocol are shown below:

Columns are grouped as request messages (GetS, GetM, PutS, PutM + Data from Owner) and response messages (Data).

| State | GetS | GetM | PutS | PutM + Data from Owner | Data |
| --- | --- | --- | --- | --- | --- |
| I | Send data to Req, add Req to sharer / S | Send data to Req, set Owner as Req / M | | | |
| S | Send data to Req, add Req to sharer / S | Send data to Req, send Inv to sharers, clear sharers, set Owner to Req / M | Remove Req from sharers, send Put-Ack to Req / S (not the last PutS) or I (the last PutS) | | |
| M | Send Fwd-GetS to Owner, add Req and Owner to sharer, clear Owner | Send Fwd-GetM to Owner, set Owner to Req | | Copy data to memory, clear Owner, send Put-Ack to Req / I | Copy data to memory / S |

When the directory gets a “GetS” request for a line in State M, it sends the “Fwd-GetS” request to the owner cache, and expects a data write back. Once it gets the data response from the previous owner cache, it copies the data to memory and transitions the directory state to State S.
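
As a rough illustration of the GetS-in-State-M flow above, here is a minimal Python sketch of one directory entry. The class `DirEntry`, its method names, and the `sent` log are hypothetical. Excluding the requester from the Inv list is also an assumption; the table only says "send Inv to sharers".

```python
class DirEntry:
    """Directory state for one line in the base model (hypothetical sketch)."""

    def __init__(self):
        self.state = "I"
        self.owner = None
        self.sharers = set()
        self.sent = []  # log of (message, destination) pairs

    def get_m(self, req):
        if self.state == "M":
            self.sent.append(("Fwd-GetM", self.owner))
        else:
            self.sent.append(("Data", req))
            # Assumption: the requester itself is not sent an Inv.
            for sharer in sorted(self.sharers - {req}):
                self.sent.append(("Inv", sharer))
            self.sharers.clear()
        self.owner = req
        self.state = "M"

    def get_s(self, req):
        if self.state == "M":
            self.sent.append(("Fwd-GetS", self.owner))
            self.sharers.update({req, self.owner})
            self.owner = None
            # State stays M until the old owner's data arrives. The base
            # model assumes no other request shows up in the meantime.
        else:
            self.sent.append(("Data", req))
            self.sharers.add(req)
            self.state = "S"

    def on_owner_data(self):
        # Copying the data to memory is not modeled here.
        self.state = "S"


d = DirEntry()
d.get_m("C0")        # C0 becomes owner
d.get_s("C1")        # forwarded to C0
d.on_owner_data()    # write-back from C0 arrives
print(d.state, sorted(d.sharers))
```

The final state is S with both C0 and C1 recorded as sharers, matching the table row for State M.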

Handling Non-Atomic Requests (By Stalls)

In a real directory based MSI protocol, a cache controller may receive forwarded request messages for a cache line that has a coherence transaction outstanding. In addition, the directory controller may receive request messages for a cache line that has a data response outstanding.

Handling such cases requires adding transient states to the directory based MSI protocol. In addition, to keep the transient states simple, we stall the “conflicting” forwarded request messages at cache controllers and the “conflicting” request messages at the directory.

Given the following assumptions:

  1. All caches implement write-back + write-allocate policy
  2. Separate networks for each message type for deadlock prevention
  3. The interconnection network / fabric enforces point-to-point ordering for forwarded request messages, i.e., if a directory sends two forwarded request messages to a cache controller, the messages arrive at that cache controller in order
  4. A complete directory accurately tracks the status and sharers of each cache line

The cache controller state transitions for the directory based MSI protocol are shown below:

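Columns are grouped as processor core events (Load, Store, Eviction), forwarded request messages (Fwd-GetS, Fwd-GetM, Inv, Put-Ack), and response messages (Data from Dir, Data from Owner, Inv-Ack).

| State | Load | Store | Eviction | Fwd-GetS | Fwd-GetM | Inv | Put-Ack | Data from Dir | Data from Owner | Inv-Ack |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| I | Issue GetS / IS-D | Issue GetM / IM-AD | | | | | | | | |
| IS-D | Stall | Stall | Stall | | | Stall | | Data[ack=0] / S | Data[ack=0] / S | |
| IM-AD | Stall | Stall | Stall | Stall | Stall | | | Data[ack=0] / M; Data[ack>0] / IM-A | Data[ack=0] / M | ack-- |
| IM-A | Stall | Stall | Stall | Stall | Stall | | | | | if (last Inv-Ack) – / M; else ack-- |
| S | Load hit | Issue GetM / SM-AD | Issue PutS / SI-A | | | Send Inv-Ack to Req / I | | | | |
| SM-AD | Load hit | Stall | Stall | Stall | Stall | Send Inv-Ack to Req / IM-AD | | Data[ack=0] / M; Data[ack>0] / SM-A | | ack-- |
| SM-A | Load hit | Stall | Stall | Stall | Stall | | | | | if (last Inv-Ack) – / M; else ack-- |
| M | Load hit | Store hit | Issue PutM, send data to Dir / MI-A | Send Data[ack=0] to Req & Dir / S | Send Data[ack=0] to Req / I | | | | | |
| MI-A | Stall | Stall | Stall | Send Data[ack=0] to Req & Dir / SI-A | Send Data[ack=0] to Req / II-A | | – / I | | | |
| SI-A | Stall | Stall | Stall | | | Send Inv-Ack to Req / II-A | – / I | | | |
| II-A | Stall | Stall | Stall | | | | – / I | | | |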

In the table above, transient states are written in the form “XY-AD”: the line is moving from stable State X to stable State Y, “A” means that acknowledgements are outstanding, and “D” means that a data response is outstanding.

When a cache controller processes a load miss, the cache line state first transitions to “IS-D”. In the meantime, the cache controller may receive an Inv message for the same line (because data responses and Inv messages travel on separate networks). The cache controller simply stalls the Inv message until the data response returns and the cache line state eventually transitions to State S.
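
One way to picture the stall is a per-line queue of deferred messages that is replayed once the line leaves the transient state. The sketch below is a hypothetical model, not how any particular controller implements it. The names `CacheLine`, `stalled`, and `sent` are assumptions, and only the load-miss path is modeled.

```python
from collections import deque


class CacheLine:
    """Load-miss path of one cache line with a stall queue (hypothetical)."""

    def __init__(self):
        self.state = "I"
        self.stalled = deque()  # Inv messages held back while in IS-D
        self.sent = []          # log of outgoing messages

    def load(self):
        if self.state == "I":
            self.sent.append("GetS")
            self.state = "IS-D"

    def on_inv(self, req):
        if self.state == "IS-D":
            self.stalled.append(req)  # stall: do not answer yet
        elif self.state == "S":
            self.sent.append(f"Inv-Ack to {req}")
            self.state = "I"

    def on_data(self):
        if self.state == "IS-D":
            self.state = "S"  # the pending load can now complete
            # Replay whatever was stalled while the data was outstanding.
            while self.stalled:
                self.on_inv(self.stalled.popleft())


line = CacheLine()
line.load()
line.on_inv("C1")   # arrives early on the forwarded-request network
line.on_data()      # data arrives; the stalled Inv is then handled
print(line.state, line.sent)
```

In this run the line reaches S when the data arrives, the stalled Inv is then answered with an Inv-Ack, and the line ends in I.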

Similarly, when a cache controller processes a store miss, the cache line state first transitions to either “IM-AD” or “SM-AD” to wait for the data response, then possibly transitions to either “IM-A” or “SM-A” to wait for any remaining Inv-Ack messages. In the meantime, the cache controller may receive “Fwd-GetS” or “Fwd-GetM” requests for the same cache line. The cache controller simply stalls these forwarded requests until the cache line eventually reaches the stable State M.

In particular, when a cache line is in State “SM-AD” (logically the same as State S), it must respond to Inv messages. The cache controller transitions the cache line to State “IM-AD”, to prevent further load hits.
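
The difference between “SM-AD” and “IM-AD” can be made concrete with a small lookup table. This is a sketch that copies only a few entries from the cache controller table above. The dictionary name `STORE_PATH` and the function `step` are hypothetical.

```python
# (state, event) -> (action, next state); "stall" means the event is held back.
STORE_PATH = {
    ("SM-AD", "Load"): ("hit", "SM-AD"),
    ("SM-AD", "Inv"): ("send Inv-Ack to Req", "IM-AD"),
    ("SM-AD", "Fwd-GetS"): ("stall", "SM-AD"),
    ("SM-AD", "Fwd-GetM"): ("stall", "SM-AD"),
    ("IM-AD", "Load"): ("stall", "IM-AD"),
    ("IM-AD", "Fwd-GetS"): ("stall", "IM-AD"),
    ("IM-AD", "Fwd-GetM"): ("stall", "IM-AD"),
}


def step(state, event):
    """Return (action, next_state) for the modeled subset of the table."""
    return STORE_PATH[(state, event)]


print(step("SM-AD", "Load"))     # loads still hit while the line is logically S
action, state = step("SM-AD", "Inv")
print(action, state)             # the Inv is answered and the line drops to IM-AD
print(step(state, "Load"))       # loads now stall until the data arrives
```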

When a cache line in State S gets evicted, it cannot be evicted silently. Instead, the cache controller must issue a “PutS” request to the directory, and transition the cache line state to “SI-A” first. Before the cache controller receives the “Put-Ack” from the directory, the cache line state is logically the same as State S. If the cache controller receives an Inv message for the line, it needs to respond and transition the cache line state to “II-A”. State “II-A” is logically the same as State I, but it denotes that the cache controller is still waiting for a “Put-Ack” to complete the eviction.

Similarly, when a cache line in State M gets evicted, the cache controller issues a “PutM” request to the directory, and transitions the cache line state to “MI-A” first. Before the cache controller receives the “Put-Ack” from the directory, the cache line state is logically the same as State M. If the cache controller receives a “Fwd-GetS” or “Fwd-GetM” request, it needs to respond and transition the cache line state to “SI-A” or “II-A”, respectively. These states are logically the same as State S or I, but they denote that the cache controller is still waiting for a “Put-Ack” to complete the eviction.
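
The eviction paths above can be traced with a small table-driven sketch. The entries are copied from the cache controller table. The names `EVICTION` and `run` are hypothetical, and events that the table stalls are simply not modeled.

```python
# (state, event) -> (action or None, next state), eviction-related rows only.
EVICTION = {
    ("S", "Eviction"): ("issue PutS", "SI-A"),
    ("M", "Eviction"): ("issue PutM, send data to Dir", "MI-A"),
    ("MI-A", "Fwd-GetS"): ("send Data[ack=0] to Req & Dir", "SI-A"),
    ("MI-A", "Fwd-GetM"): ("send Data[ack=0] to Req", "II-A"),
    ("SI-A", "Inv"): ("send Inv-Ack to Req", "II-A"),
    ("MI-A", "Put-Ack"): (None, "I"),
    ("SI-A", "Put-Ack"): (None, "I"),
    ("II-A", "Put-Ack"): (None, "I"),
}


def run(state, events):
    """Apply events in order and return (final state, list of actions taken)."""
    actions = []
    for event in events:
        action, state = EVICTION[(state, event)]
        if action is not None:
            actions.append(action)
    return state, actions


# An evicted M line that is downgraded, then invalidated, before its Put-Ack.
final, actions = run("M", ["Eviction", "Fwd-GetS", "Inv", "Put-Ack"])
print(final)
for a in actions:
    print(" ", a)
```

The trace passes through MI-A, SI-A, and II-A before the Put-Ack finally moves the line to I.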

Denoting directory states from the caches’ perspective, the directory state transitions for the protocol are shown below:

Columns are grouped as request messages (GetS, GetM, PutS, PutM + Data from Owner, PutM + Data from Non-Owner) and response messages (Data).

| State | GetS | GetM | PutS | PutM + Data from Owner | PutM + Data from Non-Owner | Data |
| --- | --- | --- | --- | --- | --- | --- |
| I | Send data to Req, add Req to sharer / S | Send data to Req, set Owner as Req / M | Send Put-Ack to Req | | Send Put-Ack to Req | |
| S | Send data to Req, add Req to sharer / S | Send data to Req, send Inv to sharers, clear sharers, set Owner to Req / M | Remove Req from sharers, send Put-Ack to Req / S (not the last PutS) or I (the last PutS) | | Remove Req from sharers, send Put-Ack to Req | |
| M | Send Fwd-GetS to Owner, add Req and Owner to sharer, clear Owner / S-D | Send Fwd-GetM to Owner, set Owner to Req | Send Put-Ack to Req | Copy data to memory, clear Owner, send Put-Ack to Req / I | Send Put-Ack to Req | |
| S-D | Stall | Stall | Remove Req from sharers, send Put-Ack | | Remove Req from sharers, send Put-Ack | Copy data to memory / S |

At first glance, it might be confusing to see “PutM + Data from Non-Owner” in directory State S. Say Core 0 evicts a line in State M and issues a “PutM” request to the directory. Before the “PutM” request from Core 0 reaches the directory, the directory may have already served “GetS” from Core 1 by sending “Fwd-GetS” to Core 0, and the directory may have already changed to State S. This is possible since all links are point-to-point, and there are no ordering requirements among different message classes.

Similarly, it is possible to see “PutS” and “PutM + Data from Non-Owner” in directory State M. Say Core 0 evicts a line in State S or M, and issues a “PutS” or “PutM” request to the directory. Before the “PutS” / “PutM” from Core 0 reaches the directory, the directory may have already served a “GetM” from Core 1, by sending “Inv” (if Core 0 held the line in State S) or “Fwd-GetM” (if Core 0 held it in State M) to Core 0, and changed to State M with Core 1 as the owner.

Taking this one step further, suppose Core 0 evicts a line in State S or M, and issues a “PutS” or “PutM” request to the directory. Before the “PutS” / “PutM” from Core 0 reaches the directory, the directory may have already served a “GetS” / “GetM” from Core 1 (by sending “Fwd-GetS” / “Fwd-GetM” to Core 0), followed by a “PutS” / “PutM” from Core 1, after which the directory may be back in State I. So it is possible to see “PutS” and “PutM + Data from Non-Owner” in directory State I as well.
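
The stale-PutM race can be replayed with a minimal directory sketch. It follows the directory table above for PutM handling. The class `DirEntry`, the `memory` field, and the message log `sent` are hypothetical, and only the State M starting point is modeled.

```python
class DirEntry:
    """Directory entry that starts in State M with a given owner (hypothetical)."""

    def __init__(self, owner):
        self.state = "M"
        self.owner = owner
        self.sharers = set()
        self.memory = "old"
        self.sent = []  # log of (message, destination) pairs

    def get_m(self, req):
        # Only the State M row is modeled: forward to the owner, switch owner.
        self.sent.append(("Fwd-GetM", self.owner))
        self.owner = req

    def put_m(self, req, data):
        if self.state == "M" and req == self.owner:
            # PutM + Data from Owner: accept the write-back.
            self.memory = data
            self.owner = None
            self.state = "I"
        else:
            # PutM + Data from Non-Owner: the data is stale, so drop it.
            self.sharers.discard(req)
        self.sent.append(("Put-Ack", req))


d = DirEntry(owner="C0")
d.get_m("C1")             # C1's GetM overtakes C0's PutM
d.put_m("C0", "stale")    # C0 is no longer the owner
print(d.state, d.owner, d.memory)
d.put_m("C1", "new")      # the real owner writes back
print(d.state, d.owner, d.memory)
```

In both cases the directory answers with a Put-Ack, so Core 0 can leave “II-A” even though its data was discarded.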

Again, all the scenarios discussed above can only happen when a cache controller is allowed to receive a forwarded request message for a cache line that also has a coherence transaction outstanding.

Reference

A Primer on Memory Consistency and Cache Coherence (Second Edition), by Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill, David A. Wood
