Handling non-atomic operations in snooping based MSI protocol (II)

In the previous post, we discussed non-zero delay from coherence requests to responses. However, coherence requests may also be non-atomic: a coherence request may not be instantly ordered when it is issued by a cache controller. For example, if there is a request queue between a cache controller and the system bus, coherence request atomicity is no longer guaranteed, and this is a fairly common implementation.
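The effect of a request queue can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Controller` class, the `drain` helper, and the arbitration sequence passed to it are hypothetical and do not model any particular bus.

```python
from collections import deque

class Controller:
    """Hypothetical cache controller with a request queue in front of the bus."""
    def __init__(self, name):
        self.name = name
        self.queue = deque()

    def issue(self, request, line):
        # Issuing only enqueues the request; it is not ordered yet.
        self.queue.append((self.name, request, line))

def drain(controllers, grant_order):
    """Order queued requests on the bus following an assumed arbitration sequence."""
    bus_order = []
    for name in grant_order:
        queue = controllers[name].queue
        if queue:
            bus_order.append(queue.popleft())
    return bus_order

c0, c1 = Controller("C0"), Controller("C1")
c0.issue("GetS", "X")  # C0 issues first
c1.issue("GetM", "X")  # C1 issues second
# The arbiter happens to grant C1 before C0, so bus order differs from issue order.
bus_order = drain({"C0": c0, "C1": c1}, ["C1", "C0"])
print(bus_order)
```

Between `issue` and the moment a request appears in `bus_order`, the issuing controller does not know where its request falls relative to others. That window is what the extra transient states have to cover.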

Considering non-atomic coherence requests adds even more intermediate states to the protocol. For instance, suppose a cache controller wants to change a line from State I to State S and issues a “GetS” request to the system bus. Until the cache controller observes its own “GetS” ordered on the system bus, the cache line is in State “IS-AD”, which is logically the same as State I. Once the cache controller sees its own “GetS” request, the line transitions to State “IS-D”, which is logically the same as State S even though loads still stall until the data response arrives.

Given the following assumptions:

All caches implement write-back + write-allocate policy
Coherence transactions are atomic, i.e., a subsequent coherence request for the same cache line shall be stalled on the system bus until after the first coherence transaction for that line completes

The cache controller state transitions for the snooping-based MSI protocol are shown below:

	Processor Core Events			System Bus Events: Own Transaction				System Bus Events: Coherence Req for Other Cores		
State	Load	Store	Eviction	Own-GetS	Own-GetM	Own-PutM	Data Rsp	Other-GetS	Other-GetM	Other-PutM
I	Issue GetS / IS-AD	Issue GetM / IM-AD	–	–	–	–	–	–	–	–
IS-AD	Stall	Stall	Stall	– / IS-D	–	–	–	–	–	–
IS-D	Stall	Stall	Stall	–	–	–	Copy data into cache, load hit / S	–	–	–
IM-AD	Stall	Stall	Stall	–	– / IM-D	–	–	–	–	–
IM-D	Stall	Stall	Stall	–	–	–	Copy data into cache, store hit / M	–	–	–
S	Hit	Issue GetM / SM-AD	– / I	–	–	–	–	–	– / I	–
SM-AD	Hit	Stall	Stall	–	– / SM-D	–	–	–	– / IM-AD	–
SM-D	Hit	Stall	Stall	–	–	–	Copy data into cache, store hit / M	–	–	–
M	Hit	Hit	Issue PutM / MI-A	–	–	–	–	Send data to req & memory / S	Send data to req / I	–
MI-A	Hit	Hit	Stall	–	–	Send data to memory / I	–	Send data to req & memory / II-A	Send data to req / II-A	–
II-A	Stall	Stall	Stall	–	–	Send NoData to memory / I	–	–	–	–
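The table above can be encoded as a lookup from (state, event) to (action, next state). The sketch below is one possible encoding, not a reference implementation: `CACHE_FSM`, `step`, and `run` are names chosen here, and a “–” cell is treated as "no action, state unchanged", which does not distinguish ignored events from events that cannot occur in that state.

```python
# (state, event) -> (action, next_state), copied from the table above.
CACHE_FSM = {
    ("I", "Load"): ("Issue GetS", "IS-AD"),
    ("I", "Store"): ("Issue GetM", "IM-AD"),
    ("IS-AD", "Own-GetS"): (None, "IS-D"),
    ("IS-D", "Data Rsp"): ("Copy data into cache, load hit", "S"),
    ("IM-AD", "Own-GetM"): (None, "IM-D"),
    ("IM-D", "Data Rsp"): ("Copy data into cache, store hit", "M"),
    ("S", "Load"): ("Hit", "S"),
    ("S", "Store"): ("Issue GetM", "SM-AD"),
    ("S", "Eviction"): (None, "I"),
    ("S", "Other-GetM"): (None, "I"),
    ("SM-AD", "Load"): ("Hit", "SM-AD"),
    ("SM-AD", "Own-GetM"): (None, "SM-D"),
    ("SM-AD", "Other-GetM"): (None, "IM-AD"),
    ("SM-D", "Load"): ("Hit", "SM-D"),
    ("SM-D", "Data Rsp"): ("Copy data into cache, store hit", "M"),
    ("M", "Load"): ("Hit", "M"),
    ("M", "Store"): ("Hit", "M"),
    ("M", "Eviction"): ("Issue PutM", "MI-A"),
    ("M", "Other-GetS"): ("Send data to req & memory", "S"),
    ("M", "Other-GetM"): ("Send data to req", "I"),
    ("MI-A", "Load"): ("Hit", "MI-A"),
    ("MI-A", "Store"): ("Hit", "MI-A"),
    ("MI-A", "Own-PutM"): ("Send data to memory", "I"),
    ("MI-A", "Other-GetS"): ("Send data to req & memory", "II-A"),
    ("MI-A", "Other-GetM"): ("Send data to req", "II-A"),
    ("II-A", "Own-PutM"): ("Send NoData to memory", "I"),
}
CORE_EVENTS = {"Load", "Store", "Eviction"}
TRANSIENT = {"IS-AD", "IS-D", "IM-AD", "IM-D", "SM-AD", "SM-D", "MI-A", "II-A"}

def step(state, event):
    """Return (action, next_state) for one event on one cache line."""
    if (state, event) in CACHE_FSM:
        return CACHE_FSM[(state, event)]
    if state in TRANSIENT and event in CORE_EVENTS:
        return ("Stall", state)  # every remaining core event in a transient state stalls
    return (None, state)         # dash cell: assumed to mean no action, no state change

def run(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        _, state = step(state, event)
    return state

# A load miss: issue GetS, observe it on the bus, then receive the data.
final = run("I", ["Load", "Own-GetS", "Data Rsp"])
print(final)
```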

Note that if a core stores to a line in State S, the cache controller issues a “GetM” request and moves the line to State “SM-AD”, where it stays until the controller observes its own “GetM” ordered on the bus. State “SM-AD” is logically the same as State S, so loads still hit and the cache controller ignores “Other-GetS” requests. However, if an “Other-GetM” is ordered first, the line has logically been invalidated, and the cache controller must transition it to “IM-AD” to prevent further load hits.
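The two outcomes of this race can be replayed with a small sketch. Only the transitions relevant to a store upgrade are included; `UPGRADE`, `LOAD_HITS`, and `replay` are names made up for this example.

```python
# Store-upgrade transitions taken from the cache controller table.
UPGRADE = {
    ("S", "Store"): "SM-AD",
    ("SM-AD", "Own-GetM"): "SM-D",
    ("SM-AD", "Other-GetM"): "IM-AD",  # another writer was ordered first
    ("IM-AD", "Own-GetM"): "IM-D",
    ("SM-D", "Data Rsp"): "M",
    ("IM-D", "Data Rsp"): "M",
}
# States in which the table lists a load as a hit.
LOAD_HITS = {"S", "SM-AD", "SM-D", "M"}

def replay(events, state="S"):
    """Return the final state and a log of (state, result) for each load."""
    loads = []
    for event in events:
        if event == "Load":
            loads.append((state, "Hit" if state in LOAD_HITS else "Stall"))
        else:
            # Events without an entry (e.g. Other-GetS in SM-AD) leave the state alone.
            state = UPGRADE.get((state, event), state)
    return state, loads

# Own GetM is ordered first: loads keep hitting throughout.
won = replay(["Store", "Load", "Other-GetS", "Load", "Own-GetM", "Data Rsp"])
# Other-GetM is ordered first: the second load must stall.
lost = replay(["Store", "Load", "Other-GetM", "Load", "Own-GetM", "Data Rsp"])
print(won)
print(lost)
```

Both runs end in State M, but in the second one the line passes through “IM-AD” and “IM-D”, so the controller stops serving loads from the stale copy.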

The State M to State I downgrade requires special care as well. When a cache line in State M gets evicted, the cache controller issues a “PutM” request and changes the cache line state to “MI-A”. If another core sends “GetS” or “GetM” for that line while it is still in State “MI-A”, the cache controller must respond as if the line were still in State M, and then transition to State “II-A” to wait for its own “PutM” to be ordered. Once the cache controller observes its own “PutM”, it cannot simply transition to State I: the memory has already seen the “PutM” request and would be left stuck in a transient state. The cache controller cannot send the data to memory either, since the data may already have been modified by another core. The solution is to send a special “NoData” message to the memory, which signals that the “PutM” came from a non-owner and lets the memory exit the intermediate state.

The memory state transitions for the protocol are shown below:

	System Bus Events				
State	GetS	GetM	PutM	Data from Owner	NoData
IorS	Send data as Data Rsp to req / IorS	Send data as Data Rsp to req / M	– / IorS-D	–	–
IorS-D	–	–	–	Update data in memory / IorS	– / IorS
M	– / IorS-D	–	– / M-D	–	–
M-D	–	–	–	Update data in memory / IorS	– / M
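The memory table can be encoded in the same lookup style. As before, this is a sketch with names chosen for illustration (`MEMORY_FSM`, `memory_step`), and a “–” cell is assumed to mean no action and no state change.

```python
# (state, event) -> (action, next_state), copied from the memory table above.
MEMORY_FSM = {
    ("IorS", "GetS"): ("Send data as Data Rsp to req", "IorS"),
    ("IorS", "GetM"): ("Send data as Data Rsp to req", "M"),
    ("IorS", "PutM"): (None, "IorS-D"),
    ("IorS-D", "Data from Owner"): ("Update data in memory", "IorS"),
    ("IorS-D", "NoData"): (None, "IorS"),
    ("M", "GetS"): (None, "IorS-D"),
    ("M", "PutM"): (None, "M-D"),
    ("M-D", "Data from Owner"): ("Update data in memory", "IorS"),
    ("M-D", "NoData"): (None, "M"),
}

def memory_step(state, event):
    """Return (action, next_state); events with no table entry leave the state unchanged."""
    return MEMORY_FSM.get((state, event), (None, state))

# A GetS to a line owned by a cache: memory waits for the owner's data, then holds a clean copy.
state = "M"
_, state = memory_step(state, "GetS")             # M -> IorS-D
_, state = memory_step(state, "Data from Owner")  # IorS-D -> IorS
print(state)
```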

To elaborate on all scenarios for the State M to State I downgrade, consider a cache line in State “MI-A”:

If another core sends “GetS” before the “Own-PutM” request is ordered, the cache controller transitions the cache line state to “II-A” and serves the “GetS” by sending a data response to the requesting core and the memory. By the time the “Own-PutM” request is ordered, the memory must be in State “IorS” (the data response has already updated the memory); the “PutM” request then moves it to State “IorS-D”. After receiving the “NoData” message, the memory transitions back to “IorS”.
If another core sends “GetM” before the “Own-PutM” request is ordered, the cache controller transitions the cache line state to “II-A” and serves the “GetM” by sending a data response to the requesting core. By the time the “Own-PutM” request is ordered, the memory must be in State M (the owner is now the other core); the “PutM” request then moves it to State “M-D”, where the memory waits for the “NoData” message from the cache controller and returns to State M on receiving it.
If no other core sends “GetS” or “GetM” before the “Own-PutM” request is ordered, the memory transitions to “M-D” upon observing the “PutM” request; after the data from the evicting cache arrives at the memory, the memory transitions to “IorS”.
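The three scenarios can be checked by stepping the evicting cache and the memory together. The sketch below makes simplifying assumptions: transactions are atomic, any message the cache sends to memory is delivered before the next bus request is ordered, and only the evicting cache and the memory are modeled. The names `CACHE`, `MEMORY`, and `evict` are invented for this example.

```python
# Evicting cache: (state, bus event) -> (message sent to memory or None, next state).
CACHE = {
    ("MI-A", "Other-GetS"): ("Data from Owner", "II-A"),
    ("MI-A", "Other-GetM"): (None, "II-A"),  # data goes to the requester only
    ("MI-A", "Own-PutM"): ("Data from Owner", "I"),
    ("II-A", "Own-PutM"): ("NoData", "I"),
}
# Memory: (state, event) -> next state, for the entries these scenarios touch.
MEMORY = {
    ("M", "GetS"): "IorS-D",
    ("M", "PutM"): "M-D",
    ("IorS", "PutM"): "IorS-D",
    ("IorS-D", "Data from Owner"): "IorS",
    ("IorS-D", "NoData"): "IorS",
    ("M-D", "Data from Owner"): "IorS",
    ("M-D", "NoData"): "M",
}

def evict(bus_order):
    """Replay bus requests against a line in MI-A whose memory state is M."""
    cache, memory = "MI-A", "M"
    trace = []
    for event in bus_order:
        request = event.split("-", 1)[1]                 # "Other-GetS" -> "GetS"
        memory = MEMORY.get((memory, request), memory)   # memory observes the request
        message, cache = CACHE[(cache, event)]           # cache reacts to the same request
        if message is not None:
            memory = MEMORY[(memory, message)]           # assumed immediate delivery
        trace.append((event, cache, memory))
    return cache, memory, trace

gets_first = evict(["Other-GetS", "Own-PutM"])
getm_first = evict(["Other-GetM", "Own-PutM"])
putm_only = evict(["Own-PutM"])
for result in (gets_first, getm_first, putm_only):
    print(result)
```

In every case the evicting cache ends in State I. The memory ends in “IorS” when it holds the latest data and in State M when another core has become the owner.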

This is why the memory states “IorS-D” and “M-D” must be kept distinct: on a “NoData” message, “IorS-D” returns to “IorS”, whereas “M-D” returns to State M.

Reference

A Primer on Memory Consistency and Cache Coherence (Second Edition), by Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill, David A. Wood

Chipress
