Handling non-atomic requests in directory based MSI protocol (II)

In the previous post, we discussed handling non-atomic requests in a directory-based MSI protocol by stalling. In cache controller transient states such as “IS-D”, “IM-A” and “SM-A”, we can instead let forwarded request messages make progress without stalling, at the expense of adding more transient states.

For example, when a cache controller has a line in state “IS-D” and receives an Inv message, it processes the request and changes the line’s state to “IS-D-I”, indicating that the cache controller should change the line state to I after the “GetS” transaction completes. By not stalling the Inv message, the cache controller can continue processing the other forwarded request messages queued behind that Inv message, which improves performance.
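The IS-D example can be sketched as a tiny state machine. This is only an illustration under simplifying assumptions, not the protocol's actual implementation: the `CacheLine` class, its method names, and the tuple format used for messages are all hypothetical, and only the I, IS-D, IS-D-I and S states are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLine:
    """One cache line in a hypothetical cache controller (illustrative only)."""
    state: str = "I"
    sent: list = field(default_factory=list)  # (message, destination) pairs

    def load(self):
        # A load in state I issues GetS and waits for data in IS-D.
        if self.state == "I":
            self.sent.append(("GetS", "Dir"))
            self.state = "IS-D"

    def on_inv(self, requestor):
        # Rather than stalling the Inv, acknowledge it immediately and record
        # that the line must end up in I once the GetS transaction completes.
        if self.state == "IS-D":
            self.sent.append(("Inv-Ack", requestor))
            self.state = "IS-D-I"
        elif self.state == "S":
            self.sent.append(("Inv-Ack", requestor))
            self.state = "I"

    def on_data(self):
        # Data[ack=0] completes the GetS transaction.
        if self.state == "IS-D":
            self.state = "S"
        elif self.state == "IS-D-I":
            self.state = "I"


line = CacheLine()
line.load()        # I -> IS-D
line.on_inv("C2")  # IS-D -> IS-D-I, Inv-Ack goes to the (hypothetical) cache C2
line.on_data()     # IS-D-I -> I
```

Because `on_inv` returns immediately, anything queued behind the Inv can be handled before the data response arrives.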

Given the following assumptions:

All caches implement write-back + write-allocate policy
Separate networks for each message type for deadlock prevention
The interconnection network / fabric enforces point-to-point ordering for forwarded request messages, i.e., if a directory sends two forwarded request messages to a cache controller, the messages arrive at that cache controller in order
A complete directory accurately tracks the status and sharers of each cache line

The cache controller state transitions for the directory-based MSI protocol are shown below:

State	Processor Core Events			Forwarded Request Messages				Response Messages
State	Load	Store	Eviction	Fwd-GetS	Fwd-GetM	Inv	Put-Ack	Data from Dir	Data from Owner	Inv-Ack
I	Issue GetS / IS-D	Issue GetM / IM-AD	–	–	–	–	–	–	–	–
IS-D	Stall	Stall	Stall	–	–	Send Inv-Ack to Req / IS-D-I	–	Data[ack=0] / S	Data[ack=0] / S	–
IS-D-I	Stall	Stall	Stall	–	–	–	–	Data[ack=0] / I	Data[ack=0] / I	–
IM-AD	Stall	Stall	Stall	Stall	Stall	–	–	Data[ack=0] / M; Data[ack>0] / IM-A	Data[ack=0] / M	ack--
IM-A	Stall	Stall	Stall	– / IM-A-S	– / IM-A-I	–	–	–	–	if (last Inv-Ack) – / M else ack--
IM-A-S	Stall	Stall	Stall	–	–	Send Inv-Ack to Req / IM-A-SI	–	–	–	if (last Inv-Ack) Send data to Req & Dir / S else ack--
IM-A-SI	Stall	Stall	Stall	–	–	–	–	–	–	if (last Inv-Ack) Send data to Req & Dir / I else ack--
IM-A-I	Stall	Stall	Stall	–	–	–	–	–	–	if (last Inv-Ack) Send data to Req / I else ack--
S	Load hit	Issue GetM / SM-AD	Issue PutS / SI-A	–	–	Send Inv-Ack to Req / I	–	–	–	–
SM-AD	Hit	Stall	Stall	Stall	Stall	Send Inv-Ack to Req / IM-AD	–	Data[ack=0] / M; Data[ack>0] / SM-A	–	ack--
SM-A	Hit	Stall	Stall	– / SM-A-S	– / SM-A-I	–	–	–	–	if (last Inv-Ack) – / M else ack--
SM-A-S	Stall	Stall	Stall	–	–	Send Inv-Ack to Req / SM-A-SI	–	–	–	if (last Inv-Ack) Send data to Req & Dir / S else ack--
SM-A-SI	Stall	Stall	Stall	–	–	–	–	–	–	if (last Inv-Ack) Send data to Req & Dir / I else ack--
SM-A-I	Stall	Stall	Stall	–	–	–	–	–	–	if (last Inv-Ack) Send data to Req / I else ack--
M	Load hit	Store hit	Issue PutM, send data to Dir / MI-A	Send Data[ack=0] to Req & Dir / S	Send Data[ack=0] to Req / I	–	–	–	–	–
MI-A	Stall	Stall	Stall	Send Data[ack=0] to Req & Dir / SI-A	Send Data[ack=0] to Req / II-A	–	– / I	–	–	–
SI-A	Stall	Stall	Stall	–	–	Send Inv-Ack to Req / II-A	– / I	–	–	–
II-A	Stall	Stall	Stall	–	–	–	– / I	–	–	–

In the above table, states like “IM-A-*” and “SM-A-*” enable forward progress while the cache controller is still gathering “Inv-Ack” messages from other caches after issuing a “GetM” request to the directory.

Note that for states like “IM-A-S*” and “SM-A-S*” (entered on “Fwd-GetS” requests), the cache controller should send a data response to both the requesting cache and the directory after receiving all expected “Inv-Ack” messages. For states like “IM-A-I” and “SM-A-I” (entered on “Fwd-GetM” requests), the cache controller should send the data response only to the requesting cache, not the directory.
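To make the deferred data responses concrete, here is a rough sketch of the “IM-A” family only. It assumes a hypothetical `PendingGetM` class that counts outstanding Inv-Acks and remembers the forwarded requestor; the names and the message tuples are illustrative and are not taken from any real controller.

```python
class PendingGetM:
    """Hypothetical model of one line in IM-A, waiting for Inv-Acks."""

    def __init__(self, acks_needed):
        self.state = "IM-A"
        self.acks_needed = acks_needed  # Inv-Acks still outstanding
        self.requestor = None           # cache that sent the forwarded request
        self.sent = []                  # (message, destination) pairs

    def on_fwd_gets(self, requestor):
        # Do not stall: remember the requestor and answer after the last Inv-Ack.
        if self.state == "IM-A":
            self.requestor = requestor
            self.state = "IM-A-S"

    def on_fwd_getm(self, requestor):
        if self.state == "IM-A":
            self.requestor = requestor
            self.state = "IM-A-I"

    def on_inv(self, requestor):
        # An Inv can only follow a Fwd-GetS here (line will be shared, then invalidated).
        if self.state == "IM-A-S":
            self.sent.append(("Inv-Ack", requestor))
            self.state = "IM-A-SI"

    def on_inv_ack(self):
        self.acks_needed -= 1
        if self.acks_needed > 0:
            return  # not the last Inv-Ack yet
        if self.state == "IM-A":
            self.state = "M"
        elif self.state in ("IM-A-S", "IM-A-SI"):
            # Fwd-GetS case: data goes to the requestor AND the directory.
            self.sent += [("Data", self.requestor), ("Data", "Dir")]
            self.state = "S" if self.state == "IM-A-S" else "I"
        elif self.state == "IM-A-I":
            # Fwd-GetM case: data goes to the requestor only.
            self.sent.append(("Data", self.requestor))
            self.state = "I"


shared = PendingGetM(acks_needed=2)
shared.on_inv_ack()          # one Inv-Ack still outstanding
shared.on_fwd_gets("C3")     # IM-A -> IM-A-S
shared.on_inv_ack()          # last Inv-Ack: data to C3 and Dir, -> S

taken = PendingGetM(acks_needed=1)
taken.on_fwd_getm("C4")      # IM-A -> IM-A-I
taken.on_inv_ack()           # last Inv-Ack: data to C4 only, -> I
```

The “SM-A” family follows the same pattern with the corresponding “SM-A-*” state names.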

Denoting directory states from the caches’ perspective, the directory state transitions for the protocol are shown below:

State	Request Messages					Response Messages
State	GetS	GetM	PutS	PutM + Data from Owner	PutM + Data from Non-Owner	Data
I	Send data to Req, add Req to sharer / S	Send data to Req, set Owner as Req / M	Send Put-Ack to Req	–	Send Put-Ack to Req	–
S	Send data to Req, add Req to sharer / S	Send data to Req, send Inv to sharers, clear sharers, set Owner to Req / M	Remove Req from sharers, send Put-Ack to Req / S (not the last PutS) or I (the last PutS)	–	Remove Req from sharers, send Put-Ack to Req	–
M	Send Fwd-GetS to Owner, add Req and Owner to sharer, clear Owner / S-D	Send Fwd-GetM to Owner, set Owner to Req	Send Put-Ack to Req	Copy data to memory, clear Owner, send Put-Ack to Req / I	Send Put-Ack to Req	–
S-D	Stall	Stall	Remove Req from sharers, send Put-Ack	–	Remove Req from sharers, send Put-Ack	Copy data to memory / S

Note that we still require stalls in the transient state “S-D”; otherwise we would need to add an impractically large number of states to avoid stalling in all possible cases.
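A minimal sketch of the GetS/GetM columns of the directory table, including the stall in “S-D”, might look like the following. The `DirectoryEntry` class, the `(message, destination, requestor)` tuple format, and the replay queue used to model stalling are all assumptions made for illustration. PutS and PutM handling is left out, and sending Inv only to sharers other than the requestor is likewise an assumption.

```python
from collections import deque


class DirectoryEntry:
    """Hypothetical directory entry for one line (GetS/GetM/Data only)."""

    def __init__(self):
        self.state = "I"
        self.owner = None
        self.sharers = set()
        self.stalled = deque()  # requests that arrived while in S-D
        self.sent = []          # (message, destination, requestor) tuples

    def get_s(self, req):
        if self.state == "S-D":
            self.stalled.append(("GetS", req))  # stall until data arrives
        elif self.state in ("I", "S"):
            self.sent.append(("Data", req, None))
            self.sharers.add(req)
            self.state = "S"
        elif self.state == "M":
            self.sent.append(("Fwd-GetS", self.owner, req))
            self.sharers |= {req, self.owner}
            self.owner = None
            self.state = "S-D"

    def get_m(self, req):
        if self.state == "S-D":
            self.stalled.append(("GetM", req))  # stall until data arrives
        elif self.state in ("I", "S"):
            self.sent.append(("Data", req, None))
            # Assumption: the requestor itself is not sent an Inv.
            for sharer in sorted(self.sharers - {req}):
                self.sent.append(("Inv", sharer, req))
            self.sharers.clear()
            self.owner = req
            self.state = "M"
        elif self.state == "M":
            self.sent.append(("Fwd-GetM", self.owner, req))
            self.owner = req

    def on_data(self):
        # Data from the former owner: copy to memory (not modeled), S-D -> S,
        # then replay stalled requests until one re-enters S-D.
        if self.state != "S-D":
            return
        self.state = "S"
        while self.stalled and self.state != "S-D":
            kind, req = self.stalled.popleft()
            (self.get_s if kind == "GetS" else self.get_m)(req)


d = DirectoryEntry()
d.get_m("C1")  # I -> M, owner is C1
d.get_s("C2")  # M -> S-D, Fwd-GetS sent to C1
d.get_m("C3")  # stalled while in S-D
d.on_data()    # S-D -> S, then the stalled GetM runs: S -> M, owner is C3
```

Replaying stalled requests in arrival order is just one simple way to model a stall; a real directory could leave the request at the head of its input queue instead.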

Reference

A Primer on Memory Consistency and Cache Coherence (Second Edition), by Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill, David A. Wood
