How to handle coherent DMA?

A coherent DMA operation that reads memory should get the most recent version of the data, even if that data resides in a cache in state M (Modified) or O (Owned). Similarly, a coherent DMA operation that writes memory must invalidate stale copies in all caches.
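The two requirements can be sketched with a toy model. This is only an illustration, not a real protocol: the `Cache` and `System` classes, the state letters stored as strings, and the lookup by scanning every cache are all assumptions made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    # Hypothetical model: line address -> (state, data), state in {"M", "O", "S"}.
    lines: dict = field(default_factory=dict)


@dataclass
class System:
    memory: dict = field(default_factory=dict)
    caches: list = field(default_factory=list)

    def dma_read(self, addr):
        # A cache holding the line in M or O owns the most recent data,
        # so memory may be stale and the owner must supply the value.
        for cache in self.caches:
            state, data = cache.lines.get(addr, ("I", None))
            if state in ("M", "O"):
                return data
        return self.memory.get(addr)

    def dma_write(self, addr, data):
        # Every cached copy becomes stale, so drop them all before
        # the new data lands in memory.
        for cache in self.caches:
            cache.lines.pop(addr, None)
        self.memory[addr] = data
```

In this sketch, a DMA read of a line that one cache holds in M returns the cached value rather than the stale value in memory, and a DMA write leaves no cache holding the line.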

Although adding a coherent cache to the DMA controller is a straightforward way to handle coherent DMA, it is not desirable for two reasons:

  1. DMA operations have very different locality patterns from CPU cores: they stream through memory with very little temporal reuse, so a cache gives them little benefit
  2. When DMA writes data, it generally writes entire cache lines. Obtaining state M with a data response before the write is therefore wasteful, since all of the fetched data will be overwritten
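To put a rough number on the second point, the sketch below counts only the line data moved for a transfer that overwrites whole lines. The 64-byte line size and the function name `dma_write_data_traffic` are assumptions for illustration; control messages are ignored.

```python
LINE_BYTES = 64  # assumed cache line size


def dma_write_data_traffic(num_lines, fetch_before_write):
    """Bytes of line data moved when DMA overwrites num_lines whole lines."""
    written = num_lines * LINE_BYTES
    # Fetching each line with a data response before overwriting it
    # moves the same number of bytes a second time, for no benefit.
    fetched = num_lines * LINE_BYTES if fetch_before_write else 0
    return fetched + written
```

For a 4 KiB transfer (64 lines), `dma_write_data_traffic(64, True)` gives 8192 bytes while `dma_write_data_traffic(64, False)` gives 4096, so under these assumptions fetching before writing doubles the data traffic.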

Possible optimizations include:

  1. Adding support for obtaining state M without a data response
  2. Making DMA work without hardware cache coherence support by requiring the OS to selectively flush caches. However, explicit OS control is typically implemented at page granularity rather than cache line granularity, so the OS must conservatively flush a whole page even if none of its cache lines are in any cache. Because of this inefficiency, the approach is typically seen only in embedded systems
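The cost of page-granularity flushing can be sketched as follows. The 4 KiB page size, 64-byte line size, and the helper name `lines_to_flush` are assumptions for illustration, not taken from any particular OS.

```python
PAGE_BYTES = 4096  # assumed page size
LINE_BYTES = 64    # assumed cache line size
LINES_PER_PAGE = PAGE_BYTES // LINE_BYTES


def lines_to_flush(buffer_addr, buffer_len):
    """Lines flushed if the OS flushes every page that the DMA buffer touches."""
    first_page = buffer_addr // PAGE_BYTES
    last_page = (buffer_addr + buffer_len - 1) // PAGE_BYTES
    # Every line of every touched page is flushed, whether or not
    # the buffer covers it or any cache actually holds it.
    return (last_page - first_page + 1) * LINES_PER_PAGE
```

A 256-byte buffer at address `0x1000` covers only 4 lines, yet `lines_to_flush(0x1000, 256)` returns 64. A 128-byte buffer at `0x1FC0` straddles a page boundary, so `lines_to_flush(0x1FC0, 128)` returns 128 for just 2 lines of actual data.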

Reference

A Primer on Memory Consistency and Cache Coherence (Second Edition), by Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill, David A. Wood
