Handling non-atomic operations in snooping based MSI protocol (II)

In the previous post, we discussed the non-zero delay between a coherence request and its response. Coherence requests themselves may also be non-atomic: a request is not necessarily ordered at the instant a cache controller issues it. For example, a request queue between the cache controller and the system bus, which is a fairly common implementation, means coherence request atomicity is no longer guaranteed.

Supporting non-atomic coherence requests adds even more intermediate states to the protocol. For instance, suppose a cache controller intends to change a line from State I to State S and issues a “GetS” request to the system bus. Until the cache controller recognizes that its “GetS” has been ordered on the system bus, the cache line state is denoted “IS-AD”, which is logically the same as State I. Once the cache controller sees its own “GetS” request, the cache line state transitions to “IS-D” (logically the same as State S), where it waits for the data response to return.
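The I to S path described above can be sketched in a few lines of Python. This is only an illustrative model, not a real simulator: the `Bus` and `Cache` classes, their method names, and the single-line cache are all assumptions made for the example. The request queue is what separates "issued" from "ordered".

```python
from collections import deque


class Bus:
    """Hypothetical bus model: a request is ordered only when it leaves the queue."""

    def __init__(self):
        self.queue = deque()
        self.caches = []

    def issue(self, request, requester):
        self.queue.append((request, requester))  # issued, but not yet ordered

    def order_next(self):
        request, requester = self.queue.popleft()  # ordering point
        for cache in self.caches:
            cache.snoop(request, requester)


class Cache:
    """Tracks one cache line; only the load-miss path (I to S) is modeled."""

    def __init__(self, bus):
        self.state = "I"
        self.bus = bus
        bus.caches.append(self)

    def load(self):
        if self.state == "I":
            self.bus.issue("GetS", self)
            self.state = "IS-AD"
            return "miss"
        if self.state in ("IS-AD", "IS-D"):
            return "stall"  # transient states stall processor events
        return "hit"  # State S

    def snoop(self, request, requester):
        # Seeing our own GetS on the bus means it has been ordered.
        if requester is self and request == "GetS" and self.state == "IS-AD":
            self.state = "IS-D"

    def data_rsp(self):
        if self.state == "IS-D":
            self.state = "S"


bus = Bus()
cache = Cache(bus)
print(cache.load(), cache.state)  # GetS is waiting in the request queue
bus.order_next()
print(cache.state)                # the cache has observed its own GetS
cache.data_rsp()
print(cache.load(), cache.state)  # data has arrived, so loads can hit
```

The gap between `issue` and `order_next` is exactly the window in which the line sits in “IS-AD”.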

Given the following assumptions:

  1. All caches implement write-back + write-allocate policy
  2. Coherence transactions are atomic, i.e., a subsequent coherence request for the same cache line shall be stalled on the system bus until after the first coherence transaction for that line completes

The cache controller state transitions for the snooping-based MSI protocol are shown below:

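Load, Store, and Eviction are processor core events. The remaining columns are system bus events, split into the cache's own transaction (Own-GetS, Own-GetM, Own-PutM, Data Rsp) and coherence requests for other cores (Other-GetS, Other-GetM, Other-PutM). Each entry reads action / next state.

| State | Load | Store | Eviction | Own-GetS | Own-GetM | Own-PutM | Data Rsp | Other-GetS | Other-GetM | Other-PutM |
|---|---|---|---|---|---|---|---|---|---|---|
| I | Issue GetS / IS-AD | Issue GetM / IM-AD | | | | | | | | |
| IS-AD | Stall | Stall | Stall | – / IS-D | | | | | | |
| IS-D | Stall | Stall | Stall | | | | Copy data into cache, load hit / S | | | |
| IM-AD | Stall | Stall | Stall | | – / IM-D | | | | | |
| IM-D | Stall | Stall | Stall | | | | Copy data into cache, store hit / M | | | |
| S | Hit | Issue GetM / SM-AD | – / I | | | | | | – / I | |
| SM-AD | Hit | Stall | Stall | | – / SM-D | | | | – / IM-AD | |
| SM-D | Hit | Stall | Stall | | | | Copy data into cache, store hit / M | | | |
| M | Hit | Hit | Issue PutM / MI-A | | | | | Send data to req & memory / S | Send data to req / I | |
| MI-A | Hit | Hit | Stall | | | Send data to memory / I | | Send data to req & memory / II-A | Send data to req / II-A | |
| II-A | Stall | Stall | Stall | | | Send NoData to memory / I | | | | |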

Note, if a core stores to a line in State S, the cache controller issues a “GetM” request and transitions to “SM-AD” state, before the “GetM” request is ordered and recognized by the cache controller. State “SM-AD” is logically the same as State S, thus loads can still proceed and the cache controller ignores “Other-GetS” requests. However, if an “Other-GetM” is ordered and recognized first, the cache controller must transition the state to “IM-AD” to prevent further load hits.
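As a rough illustration of the upgrade path, the rows involved can be written as a lookup table and replayed against a sequence of events. The dictionary name `UPGRADE`, the helper `replay`, and the choice to treat unlisted events as ignored are assumptions made for this sketch; the entries themselves are taken from the cache controller table above.

```python
# (state, event) -> (action, next state); None means no action.
UPGRADE = {
    ("S", "Load"): ("hit", "S"),
    ("S", "Store"): ("issue GetM", "SM-AD"),
    ("S", "Other-GetM"): (None, "I"),
    ("SM-AD", "Load"): ("hit", "SM-AD"),
    ("SM-AD", "Other-GetS"): (None, "SM-AD"),  # ignored, still logically S
    ("SM-AD", "Own-GetM"): (None, "SM-D"),
    ("SM-AD", "Other-GetM"): (None, "IM-AD"),  # lost the race, read permission is gone
    ("SM-D", "Load"): ("hit", "SM-D"),
    ("SM-D", "Data Rsp"): ("copy data, store hit", "M"),
    ("IM-AD", "Load"): ("stall", "IM-AD"),
    ("IM-AD", "Own-GetM"): (None, "IM-D"),
    ("IM-D", "Load"): ("stall", "IM-D"),
    ("IM-D", "Data Rsp"): ("copy data, store hit", "M"),
}


def replay(state, events):
    """Apply events in order; events with no table entry are assumed to be ignored."""
    trace = []
    for event in events:
        action, state = UPGRADE.get((state, event), (None, state))
        trace.append((event, action, state))
    return state, trace


# Own GetM is ordered first: loads keep hitting during the upgrade.
print(replay("S", ["Store", "Load", "Own-GetM", "Load", "Data Rsp"]))

# Another core's GetM is ordered first: the line drops to IM-AD and loads stall.
print(replay("S", ["Store", "Other-GetM", "Load", "Own-GetM", "Data Rsp"]))
```

Both runs end in State M, but only the first one may serve loads from the old copy while the upgrade is in flight.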

The State M to State I downgrade requires special care as well. When a cache line in State M gets evicted, the cache controller issues a “PutM” request and changes the cache line state to “MI-A”. If another core sends “GetS” or “GetM” for that line while it is still in State “MI-A”, the cache controller must respond as if it were still in State M, and then transition to State “II-A” to wait for its own “PutM” to be ordered. Once the cache controller recognizes its own “PutM”, it cannot simply transition to State I: the memory has already seen the “PutM” request and would be left stuck in a transient state. The cache controller cannot send the data to memory either, since the data may have already been modified in another core. The solution is to send a special “NoData” message to the memory, signaling that the message comes from a non-owner and letting the memory exit the intermediate state.
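The cache side of this eviction race can be sketched as a small function that replays the bus events seen after “PutM” has been issued. The function name `evicting_cache` and the message strings are hypothetical; the transitions follow the “MI-A” and “II-A” rows of the cache controller table.

```python
def evicting_cache(bus_events):
    """Replay bus events seen by a cache that has issued PutM and sits in MI-A.

    Returns the final state and the list of messages the cache sent.
    """
    state, sent = "MI-A", []
    for event in bus_events:
        if state == "MI-A" and event == "Own-PutM":
            sent.append("Data to memory")  # still the owner, so write back
            state = "I"
        elif state == "MI-A" and event == "Other-GetS":
            sent += ["Data to requester", "Data to memory"]  # respond as if in M
            state = "II-A"
        elif state == "MI-A" and event == "Other-GetM":
            sent.append("Data to requester")  # respond as if in M
            state = "II-A"
        elif state == "II-A" and event == "Own-PutM":
            sent.append("NoData to memory")  # no longer the owner
            state = "I"
    return state, sent


print(evicting_cache(["Own-PutM"]))
print(evicting_cache(["Other-GetS", "Own-PutM"]))
print(evicting_cache(["Other-GetM", "Own-PutM"]))
```

In every case the cache ends in State I; what differs is whether its own “PutM” is answered with data or with “NoData”.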

The corresponding memory state transitions are shown below:

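All columns are system bus events. Each entry reads action / next state.

| State | GetS | GetM | PutM | Data from Owner | NoData |
|---|---|---|---|---|---|
| IorS | Send data as Data Rsp to req / IorS | Send data as Data Rsp to req / M | – / IorS-D | | |
| IorS-D | | | | Update data in memory / IorS | – / IorS |
| M | – / IorS-D | | – / M-D | | |
| M-D | | | | Update data in memory / IorS | – / M |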

To elaborate on all scenarios for the State M to State I downgrade, consider a cache line in State “MI-A”:

  1. If another core sends “GetS” before the “Own-PutM” request is ordered, the cache controller transitions the cache line state to “II-A” and serves the “GetS” by sending a data response to the requesting core and the memory. By the time the “Own-PutM” request is ordered, the memory must be in State “IorS” (the data response has updated the memory); it then transitions to State “IorS-D” in response to the “PutM” request. After receiving the “NoData” message, the memory transitions back to “IorS”.
  2. If another core sends “GetM” before the “Own-PutM” request is ordered, the cache controller transitions the cache line state to “II-A” and serves the “GetM” by sending a data response to the requesting core. By the time the “Own-PutM” request is ordered, the memory must be in State M (the owner is another core); it then transitions to State “M-D” in response to the “PutM” request, where it expects a “NoData” message from the cache controller. After receiving the “NoData” message, the memory transitions back to State M.
  3. If no other core sends “GetS” or “GetM” before the “Own-PutM” request is ordered, the memory transitions to State “M-D” upon recognizing the “PutM” request. After the data response arrives at the memory, the memory transitions to State “IorS”.
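The three scenarios can be checked by replaying them against the memory state table. The sketch below is a minimal model under stated assumptions: the names `MEMORY` and `replay_memory` are made up for the example, “Data” stands for the "Data from Owner" column, and the blank GetM entry for State M is assumed to leave the memory in State M, as scenario 2 describes.

```python
# (memory state, bus event) -> next memory state
MEMORY = {
    ("IorS", "GetS"): "IorS",
    ("IorS", "GetM"): "M",
    ("IorS", "PutM"): "IorS-D",
    ("IorS-D", "Data"): "IorS",
    ("IorS-D", "NoData"): "IorS",
    ("M", "GetS"): "IorS-D",
    ("M", "GetM"): "M",  # assumption: ownership moves between caches, memory stays in M
    ("M", "PutM"): "M-D",
    ("M-D", "Data"): "IorS",
    ("M-D", "NoData"): "M",
}


def replay_memory(events, state="M"):
    """Return the sequence of memory states visited, starting from `state`."""
    trace = [state]
    for event in events:
        state = MEMORY[(state, event)]
        trace.append(state)
    return trace


# Scenario 1: Other-GetS is ordered before Own-PutM.
print(replay_memory(["GetS", "Data", "PutM", "NoData"]))
# Scenario 2: Other-GetM is ordered before Own-PutM.
print(replay_memory(["GetM", "PutM", "NoData"]))
# Scenario 3: Own-PutM is ordered with no intervening request.
print(replay_memory(["PutM", "Data"]))
```

Scenarios 1 and 2 both end with a “NoData” message, yet the memory must return to different stable states, which is only possible if it remembers where it came from.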

This is why the memory must distinguish State “IorS-D” from State “M-D”: they handle “NoData” messages differently. On “NoData”, State “IorS-D” returns to “IorS”, while State “M-D” returns to State M.

Reference

A Primer on Memory Consistency and Cache Coherence (Second Edition), by Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill, David A. Wood
