Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

13 Jul 2021

On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
> ville.syrjala@linux.intel.com wrote:
> >
> > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > /**
> > >  * @cache_coherent:
> > >  *
> > >  * Track whether the pages are coherent with the GPU if reading or
> > >  * writing through the CPU cache.
> > >  *
> > >  * This largely depends on the @cache_level, for example if the object
> > >  * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> > >  * reads and writes through the CPU cache.
> > >  *
> > >  * Note that on platforms with shared-LLC support (HAS_LLC) reads through
> > >  * the CPU cache are always coherent, regardless of the @cache_level. On
> > >  * snooping based platforms this is not the case, unless the full
> > >  * I915_CACHE_LLC or similar setting is used.
> > >  *
> > >  * As a result of this we need to track coherency separately for reads
> > >  * and writes, in order to avoid superfluous flushing on shared-LLC
> > >  * platforms, for reads.
> > >  *
> > >  * I915_BO_CACHE_COHERENT_FOR_READ:
> > >  *
> > >  * When reading through the CPU cache, the GPU is still coherent. Note
> > >  * that no data has actually been modified here, so it might seem
> > >  * strange that we care about this.
> > >  *
> > >  * As an example, if some object is mapped on the CPU with write-back
> > >  * caching, and we read some page, then the cache likely now contains
> > >  * the data from that read. At this point the cache and main memory
> > >  * match up, so all good. But next the GPU needs to write some data to
> > >  * that same page. Now if the @cache_level is I915_CACHE_NONE and the
> > >  * platform doesn't have the shared-LLC, then the GPU will
> > >  * effectively skip invalidating the cache (or however that works
> > >  * internally) when writing the new value. This is really bad since the
> > >  * GPU has just written some new data to main memory, but the CPU cache
> > >  * is still valid and now contains stale data. As a result the next time
> > >  * we do a cached read with the CPU, we are rewarded with stale data.
> > >  * Likewise if the cache is later flushed, we might be rewarded with
> > >  * overwriting main memory with stale data.
> > >  *
> > >  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > >  *
> > >  * When writing through the CPU cache, the GPU is still coherent. Note
> > >  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > >  *
> > >  * This is never set when I915_CACHE_NONE is used for @cache_level,
> > >  * where instead we have to manually flush the caches after writing
> > >  * through the CPU cache. For other cache levels this should be set and
> > >  * the object is therefore considered coherent for both reads and writes
> > >  * through the CPU cache.
> >
> > I don't remember why we have this read vs. write split and this new
> > documentation doesn't seem to really explain it either.
>
> Hmm, I attempted to explain that earlier:
>
> Note that on platforms with shared-LLC support (HAS_LLC) reads through
> the CPU cache are always coherent, regardless of the @cache_level. On
> snooping based platforms this is not the case, unless the full
> I915_CACHE_LLC or similar setting is used.
>
> As a result of this we need to track coherency separately for reads
> and writes, in order to avoid superfluous flushing on shared-LLC
> platforms, for reads.
>
> So AFAIK it's just because shared-LLC can be coherent for reads, while
> also not being coherent for writes (CACHE_NONE),

CPU vs. GPU is fully coherent when it comes to LLC. Or at least I've
never heard of any mechanism that would make it only partially coherent.
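
For reference, the read vs. write split as the quoted kernel-doc describes it could be sketched roughly as below. This is a standalone illustration under stated assumptions, not the driver code: the enum, the bit values and the helper names (cache_coherent_bits, needs_flush_before_cpu_read, needs_flush_after_cpu_write) are hypothetical stand-ins, and it only encodes what the comment claims, which is exactly the behaviour being questioned here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; only the concepts come from the kernel-doc. */
enum cache_level { CACHE_NONE, CACHE_LLC };

#define COHERENT_FOR_READ	(1u << 0)	/* assumed bit value */
#define COHERENT_FOR_WRITE	(1u << 1)	/* assumed bit value */

/* Derive the coherency bits the way the quoted comment describes them. */
static unsigned int cache_coherent_bits(enum cache_level level, bool has_llc)
{
	/* Any level other than CACHE_NONE: coherent for reads and writes. */
	if (level != CACHE_NONE)
		return COHERENT_FOR_READ | COHERENT_FOR_WRITE;

	/* CACHE_NONE on shared-LLC: reads are said to stay coherent. */
	if (has_llc)
		return COHERENT_FOR_READ;

	/* CACHE_NONE on a snooping platform: nothing is coherent. */
	return 0;
}

/* A cached CPU read only needs a flush/invalidate if reads are not coherent. */
static bool needs_flush_before_cpu_read(unsigned int bits)
{
	return !(bits & COHERENT_FOR_READ);
}

/* A cached CPU write needs a manual flush if writes are not coherent. */
static bool needs_flush_after_cpu_write(unsigned int bits)
{
	return !(bits & COHERENT_FOR_WRITE);
}
```

Under these assumptions, CACHE_NONE on a shared-LLC platform would skip the flush for reads but still flush after writes, which is the "superfluous flushing" the comment says the split avoids.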
-- 
Ville Syrjälä
Intel
