Architecture

  • How to optimize coherence conflict / false sharing?

    A coherence conflict happens when two cores compete for read and write access to the same cache line. False sharing is a special case of coherence conflict in which two cores read and write different data that happen to reside in the same cache line.…
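The idea of "different data in the same cache line" can be sketched with simple address arithmetic. This is only an illustration, not code from the post: the 64-byte line size and the helper names (`line_index`, `falsely_shared`, `field_offsets`) are assumptions made for the example.

```python
LINE_SIZE = 64  # assumed cache line size in bytes; real hardware may differ


def line_index(addr, line_size=LINE_SIZE):
    """Index of the cache line that contains byte address `addr`."""
    return addr // line_size


def falsely_shared(addr_a, addr_b, line_size=LINE_SIZE):
    """True when two distinct addresses map to the same cache line."""
    return addr_a != addr_b and line_index(addr_a, line_size) == line_index(addr_b, line_size)


def field_offsets(count, field_size, align=1):
    """Byte offsets of `count` fields, each starting at a multiple of `align`."""
    stride = -(-field_size // align) * align  # round field_size up to a multiple of align
    return [i * stride for i in range(count)]


# Two 8-byte per-core counters laid out back to back share one line.
packed = field_offsets(2, 8)
# Aligning each counter to its own line separates them.
padded = field_offsets(2, 8, align=LINE_SIZE)

print(packed, falsely_shared(*packed))
print(padded, falsely_shared(*padded))
```

Padding or aligning per-core data to the line size trades memory for fewer coherence invalidations, which is one common way to remove false sharing.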

  • How to handle coherent DMA?

    A coherent DMA operation that reads memory should get the most recent version of data, even if the data resides in a cache in state M or O. Similarly, a coherent DMA operation that writes memory must invalidate stale copies in all caches. Though it…
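The two rules stated above can be modeled with a toy sketch. It is a simplification under assumed data structures, not the post's design: each cache is a plain dictionary mapping an address to a `(state, data)` pair, and `dma_read` / `dma_write` are hypothetical helper names.

```python
def dma_read(addr, memory, caches):
    """Return the most recent data for `addr`.

    A cache holding the line in state M or O owns the latest copy,
    so its data takes priority over what is in memory.
    """
    for cache in caches:
        entry = cache.get(addr)
        if entry is not None and entry[0] in ("M", "O"):
            return entry[1]
    return memory[addr]


def dma_write(addr, value, memory, caches):
    """Write `value` to memory and invalidate every cached copy of `addr`."""
    memory[addr] = value
    for cache in caches:
        if addr in cache:
            cache[addr] = ("I", None)  # stale copy is now invalid


# Toy setup: memory holds 1, but cache 0 has a newer value 7 in state M.
memory = {0x40: 1}
caches = [{0x40: ("M", 7)}, {}]

before = dma_read(0x40, memory, caches)   # served from the M copy
dma_write(0x40, 9, memory, caches)        # invalidates cache 0's copy
after = dma_read(0x40, memory, caches)    # served from memory
print(before, after)
```

A real system would perform these lookups through the coherence protocol itself rather than by scanning caches directly; the sketch only captures the required outcome.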

  • What are the advantages and disadvantages of using write-through cache in cache coherence protocol?

    We have assumed write-back caches in cache coherence protocols, but it is possible to use write-through caches as well. Advantages of using write-through caches: Disadvantage of using write-through cache: Reference: A Primer on Memory Consistency and Cache Coherence (Second Edition), by Vijay Nagarajan, Daniel J. Sorin,…

  • Handling non-atomic requests in directory based MSI protocol (II)

    In the previous post, we discussed handling non-atomic requests in directory based MSI protocol by stalling. In cache controller transient states such as “IS-D”, “IM-A” and “SM-A”, we could allow forwarded request messages to make progress without stalling, at the expense of adding more…

  • Handling non-atomic requests in directory based MSI protocol (I)

    Just like snooping based protocols, directory based cache coherence protocols have to handle non-atomic requests in real-world implementations. We start from a directory based MSI base model, and discuss one solution for handling non-atomic requests. Base Model In directory based MSI protocol, there…

  • Handling non-atomic operations in snooping based MSI protocol (II)

    In the previous post, we discussed non-zero delay from coherence requests to responses. However, coherence requests may also be non-atomic: a coherence request may not be instantly ordered when it is issued by a cache controller. For example, if there is a request queue between…

  • Handling non-atomic operations in snooping based MSI protocol (I)

    In a cache coherence protocol implementation, designers must properly handle non-atomic operations, since coherence transactions cannot complete instantly. We start from the well-known snooping based MSI protocol base model (without considering atomicity), and discuss atomicity handling in the real world. Base Model Given the following assumptions: The…

  • Understand Cache Coherence from Memory Model’s Perspective

    To understand cache coherence, we have to take one step back and look at the memory model first. What is a Memory Model? A memory consistency model, or memory model, dictates the order in which memory reads and writes (or loads and stores) get applied to…

  • From CPU ISA to CPU Microcode Hacking

    The Google security team identified EntrySign, a security vulnerability in AMD Zen-based CPUs. This is a perfect opportunity to understand various CPU instruction concepts, including ISA, CISC, microcode, and microcode patching. What is ISA? An Instruction Set Architecture (ISA) defines the fundamental instruction set a…

  • Design a DDR Memory Controller (V) – Timing Parameters

    Before the memory controller issues a command to DDR devices, it checks whether certain timing parameters are met; otherwise, the DDR device cannot work properly. There are tons of timing parameters in the JEDEC standards, and each DDR generation / device type could have different timing…

