On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä ville.syrjala@linux.intel.com wrote:
On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
/**
* @cache_coherent:
*
* Track whether the pages are coherent with the GPU if reading or
* writing through the CPU cache.
*
* This largely depends on the @cache_level, for example if the object
* is marked as I915_CACHE_LLC, then GPU access is coherent for both
* reads and writes through the CPU cache.
*
* Note that on platforms with shared-LLC support(HAS_LLC) reads through
* the CPU cache are always coherent, regardless of the @cache_level. On
* snooping based platforms this is not the case, unless the full
* I915_CACHE_LLC or similar setting is used.
*
* As a result of this we need to track coherency separately for reads
* and writes, in order to avoid superfluous flushing on shared-LLC
* platforms, for reads.
*
* I915_BO_CACHE_COHERENT_FOR_READ:
*
* When reading through the CPU cache, the GPU is still coherent. Note
* that no data has actually been modified here, so it might seem
* strange that we care about this.
*
* As an example, if some object is mapped on the CPU with write-back
* caching, and we read some page, then the cache likely now contains
* the data from that read. At this point the cache and main memory
* match up, so all good. But next the GPU needs to write some data to
* that same page. Now if the @cache_level is I915_CACHE_NONE and the
* the platform doesn't have the shared-LLC, then the GPU will
* effectively skip invalidating the cache(or however that works
* internally) when writing the new value. This is really bad since the
* GPU has just written some new data to main memory, but the CPU cache
* is still valid and now contains stale data. As a result the next time
* we do a cached read with the CPU, we are rewarded with stale data.
* Likewise if the cache is later flushed, we might be rewarded with
* overwriting main memory with stale data.
*
* I915_BO_CACHE_COHERENT_FOR_WRITE:
*
* When writing through the CPU cache, the GPU is still coherent. Note
* that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
*
* This is never set when I915_CACHE_NONE is used for @cache_level,
* where instead we have to manually flush the caches after writing
* through the CPU cache. For other cache levels this should be set and
* the object is therefore considered coherent for both reads and writes
* through the CPU cache.
I don't remember why we have this read vs. write split and this new documentation doesn't seem to really explain it either.
Hmm, I attempted to explain that earlier:
- Note that on platforms with shared-LLC support(HAS_LLC) reads through
- the CPU cache are always coherent, regardless of the @cache_level. On
- snooping based platforms this is not the case, unless the full
- I915_CACHE_LLC or similar setting is used.
- As a result of this we need to track coherency separately for reads
- and writes, in order to avoid superfluous flushing on shared-LLC
- platforms, for reads.
So AFAIK it's just because shared-LLC can be coherent for reads, while also not being coherent for writes(CACHE_NONE),
CPU vs. GPU is fully coherent when it comes to LLC. Or at least I've never heard of any mechanism that would make it only partially coherent.