[Intel-gfx] [PATCH] drm/i915 : Avoid superfluous invalidation of CPU cache lines

Wed Nov 25 09:28:51 PST 2015

On Wed, Nov 25, 2015 at 01:02:20PM +0200, Ville Syrjälä wrote:
> On Tue, Nov 24, 2015 at 10:39:38PM +0000, Chris Wilson wrote:
> > On Tue, Nov 24, 2015 at 07:14:31PM +0100, Daniel Vetter wrote:
> > > On Tue, Nov 24, 2015 at 12:04:06PM +0200, Ville Syrjälä wrote:
> > > > On Tue, Nov 24, 2015 at 03:35:24PM +0530, akash.goel at intel.com wrote:
> > > > > From: Akash Goel <akash.goel at intel.com>
> > > > > 
> > > > > When the object is moved out of CPU read domain, the cachelines
> > > > > > are not invalidated immediately. The invalidation is deferred until
> > > > > > the next time the object is brought back into CPU read domain.
> > > > > > But the invalidation is done unconditionally, i.e. even for the case
> > > > > > where the cachelines were flushed previously, when the object moved out
> > > > > > of CPU write domain. This is avoidable, and skipping it is a small optimization.
> > > > > > This is not a hypothetical case, though it is unlikely to occur often.
> > > > > > The aim is to detect changes to the backing storage whilst the
> > > > > > data is potentially in the CPU cache, and only clflush in those cases.
> > > > > 
> > > > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > > > Signed-off-by: Akash Goel <akash.goel at intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_drv.h | 1 +
> > > > >  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++++-
> > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > > index df9316f..fedb71d 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -2098,6 +2098,7 @@ struct drm_i915_gem_object {
> > > > >  	unsigned long gt_ro:1;
> > > > >  	unsigned int cache_level:3;
> > > > >  	unsigned int cache_dirty:1;
> > > > > +	unsigned int cache_clean:1;
> > > > 
> > > > So now we have cache_dirty and cache_clean which seems redundant,
> > > > except somehow cache_dirty != !cache_clean?
> > 
> > Exactly, not entirely redundant. I did think something along MESI lines
> > would be useful, but that didn't capture the different meanings we
> > employ.
> > 
> > cache_dirty tracks whether we have been eliding the clflush.
> > 
> > cache_clean tracks whether we know the cache has been completely
> > clflushed.
> 
> Can we know that with speculative prefetching and whatnot?

"The memory attribute of the page containing the affected line has no
effect on the behavior of this instruction. It should be noted that
processors are free to speculatively fetch and cache data from system
memory regions assigned a memory-type allowing for speculative reads
(i.e. WB, WC, WT memory types). The Streaming SIMD Extensions PREFETCHh
instruction is considered a hint to this speculative behavior. Because
this speculative fetching can occur at any time and is not tied to
instruction execution, CLFLUSH is not ordered with respect to PREFETCHh
or any of the speculative fetching mechanisms (that is, data could be
speculatively loaded into the cache just before, during, or after the
execution of a CLFLUSH to that cache line)."

which, taken to the extreme, means that we can't get away with this trick.

If we can at least guarantee that such speculation can't extend beyond
a page boundary, that will be enough to assert that the patch is valid.
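
For illustration only, the bookkeeping under discussion can be modelled in
userspace roughly as below. This is a sketch, not the patch: struct model_obj
and the model_*() helpers are hypothetical names, the clflush is only counted
rather than executed, and it assumes exactly what is in question above, namely
that no speculative fetch repopulates the cache after the flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not the i915 structures. */
struct model_obj {
	bool in_cpu_read_domain;    /* CPU may hold valid cachelines */
	bool cache_clean;           /* all cachelines known to be clflushed */
	unsigned int clflush_count; /* number of simulated clflush passes */
};

/* Stand-in for clflushing every cacheline of the object. */
static void model_clflush(struct model_obj *obj)
{
	assert(obj);
	obj->clflush_count++;
	obj->cache_clean = true;
}

/* Object leaves the CPU domain, e.g. it is handed to the GPU. */
static void model_leave_cpu_domain(struct model_obj *obj, bool cpu_wrote)
{
	assert(obj);
	if (cpu_wrote)
		model_clflush(obj); /* write back dirty lines */
	obj->in_cpu_read_domain = false;
}

/* Object is brought back into the CPU read domain. */
static void model_enter_cpu_domain(struct model_obj *obj)
{
	assert(obj);
	if (obj->in_cpu_read_domain)
		return;
	if (!obj->cache_clean)
		model_clflush(obj); /* invalidate possibly stale lines */
	obj->in_cpu_read_domain = true;
	obj->cache_clean = false;   /* CPU access may repopulate the cache */
}
```

In this model the invalidation is skipped only when the object was flushed on
its way out of the CPU write domain and the CPU has not touched it since. If
speculation can pull lines back in after the flush, the cache_clean assumption
no longer holds.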

Hopefully someone knows a CPU guru or two.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre