-
Handling non-atomic requests in directory based MSI protocol (II)
In the previous post, we discussed handling non-atomic requests in the directory based MSI protocol by stalling. In cache controller transient states such as “IS-D”, “IM-A” and “SM-A”, we could instead allow forwarded request messages to make progress without stalling, at the expense of adding more transient states. For example, when a cache controller has a…
-
Handling non-atomic requests in directory based MSI protocol (I)
Just like snooping based protocols, directory based cache coherence protocols have to handle non-atomic requests in real-world implementations. We start from a directory based MSI base model, and discuss one solution for handling non-atomic requests. Base Model In the directory based MSI protocol, there are 3 message types: Given the following assumptions: The cache…
-
Handling non-atomic operations in snooping based MSI protocol (II)
In the previous post, we discussed non-zero delay from coherence requests to responses. However, coherence requests may also be non-atomic: a coherence request may not be instantly ordered when it is issued by a cache controller. For example, if there is a request queue between a cache controller and the system bus, coherence request atomicity…
-
Handling non-atomic operations in snooping based MSI protocol (I)
In cache coherence protocol implementations, designers must properly handle non-atomic operations, since coherence transactions cannot complete instantly. We start from the well-known snooping based MSI protocol base model (without considering atomicity), and discuss atomicity handling in the real world. Base Model Given the following assumptions: The cache controller state transitions for the snooping based MSI protocol are…
-
How to avoid unintended fixed priority arbiter usage in RTL?
We previously discussed how to optimize PPA in RTL coding; here we cover how to avoid unintended fixed priority arbiters in RTL. Fixed priority arbiters are “expensive” in RTL implementation. The more requests the arbiter has, the more levels of logic the final grant will have. Let’s say we want to design a fixed priority…
-
Understanding Power Analysis & Estimation: 2 Recommended Readings
As SoCs get more complex, power becomes just as important as functional correctness or performance. This article, “Unified Methodology for Effective Correlation of SoC Power Estimation and Signoff” by Infineon Technologies, addresses the growing challenges in accurately estimating and correlating power in complex SoCs. This article is a good starting point for understanding how the…
-
What we learnt from Gemini Prompting Guide 101
Google released its “Gemini Prompting Guide 101” a while ago. Though the majority of the examples in this guide use Google Workspace for illustration purposes, it still provides a general idea of how to write effective prompts for all LLMs. The guide first illustrates the 4 main areas for effective prompts: It then lists a quite…
-
Understand Cache Coherence from Memory Model’s Perspective
To understand cache coherence, we have to take one step back and look at the memory model first. What is a Memory Model? A memory consistency model, or memory model, dictates the order in which memory reads and writes (or loads and stores) get applied in coherent shared memory systems. In the world of computer architecture,…
-
DFT (VIII) – How does DFT test SRAMs? What is the Memory Built-In Self-Test (MBIST)?
SRAM Fault Model Similar to the logic fault model, SRAMs can have stuck-at faults and open faults in memory cells. In addition, SRAMs can have other faults, including: Note that SRAM read and write logic, such as sense amplifiers and I/O buffers, can also have defects, but their faults are equivalent to memory cell faults. How does…
-
What is auto-ungrouping? How does it impact the implementation flow?
Besides boundary optimization, auto-ungrouping is another important synthesis optimization technique. By flattening design hierarchies for the benefit of PPA, it enables cross-boundary optimization and removes logic duplication, which often occurs for shared signals across replicated modules. Auto-ungrouping also introduces hierarchy naming changes. For example, ungrouped hierarchies will use underscore “_” instead of slash “/”…
