[Intel-gfx] [PATCH] drm/i915 : Avoid superfluous invalidation of CPU cache lines

Chris Wilson chris at chris-wilson.co.uk
Tue Nov 24 02:10:06 PST 2015


On Tue, Nov 24, 2015 at 03:35:24PM +0530, akash.goel at intel.com wrote:
> From: Akash Goel <akash.goel at intel.com>
> 
> When the object is moved out of CPU read domain, the cachelines
> are not invalidated immediately. The invalidation is deferred till
> next time the object is brought back into CPU read domain.
> But the invalidation is done unconditionally, i.e. even for the case
> where the cachelines were flushed previously, when the object moved out
> of CPU write domain. This is avoidable and would lead to some optimization.
> Though this is not a hypothetical case, it is unlikely to occur often.
> The aim is to detect changes to the backing storage whilst the
> data is potentially in the CPU cache, and only clflush in those cases.
 
Testcase: igt/gem_concurrent_blit 
Testcase: igt/benchmarks/gem_set_domain
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Signed-off-by: Akash Goel <akash.goel at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 1 +
>  drivers/gpu/drm/i915/i915_gem.c | 9 ++++++++-
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index df9316f..fedb71d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2098,6 +2098,7 @@ struct drm_i915_gem_object {
>  	unsigned long gt_ro:1;
>  	unsigned int cache_level:3;
>  	unsigned int cache_dirty:1;
> +	unsigned int cache_clean:1;
>  
>  	unsigned int frontbuffer_bits:INTEL_FRONTBUFFER_BITS;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 19c282b..a13ffd4 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3552,6 +3552,7 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj,
>  	trace_i915_gem_object_clflush(obj);
>  	drm_clflush_sg(obj->pages);
>  	obj->cache_dirty = false;
> +	obj->cache_clean = true;
>  
>  	return true;
>  }
> @@ -3982,7 +3983,13 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
>  
>  	/* Flush the CPU cache if it's still invalid. */
>  	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
> -		i915_gem_clflush_object(obj, false);
> +		/* Invalidation not needed as there should not be any data in
> +		 * CPU cache lines for this object, since clflush would have
> +		 * happened when the object last moved out of CPU write domain.
> +		 */

/* If an object is moved out of the CPU domain following a CPU write
 * and before a GPU or GTT write, we will clflush it out of the CPU cache,
 * and mark the cache as clean. As the object has not been accessed on the CPU
 * since (i.e. the CPU cache is still clean and it is out of the CPU domain),
 * we know that no CPU cache line contains stale data and so we can skip
 * invalidating the CPU cache in preparing to read from the object.
 */

This is marginally more verbose, but it spells out the sequence of events
for which we can skip the clflush invalidate.
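
To make that sequence of events concrete, here is a rough userspace model of
the flag logic the comment describes. It is only a sketch and not the patch
itself: struct model_obj and every model_* helper are hypothetical stand-ins,
and the points at which cache_clean is cleared (on a GPU/GTT write, and once
the CPU may have pulled lines back into its cache) are assumptions, since the
quoted hunk does not show them.

```c
#include <stdbool.h>

/* Hypothetical stand-in for drm_i915_gem_object; only the state needed here. */
struct model_obj {
	bool in_cpu_read_domain;	/* models read_domains & I915_GEM_DOMAIN_CPU */
	bool cache_dirty;		/* CPU cache holds writes not yet flushed */
	bool cache_clean;		/* no CPU cache line can hold stale data */
	unsigned int clflush_count;	/* how many times the model "clflushed" */
};

/* Models i915_gem_clflush_object(): flush, then record that the cache is clean. */
static void model_clflush(struct model_obj *obj)
{
	obj->clflush_count++;
	obj->cache_dirty = false;
	obj->cache_clean = true;
}

/* A CPU write dirties the cache, so it can no longer be considered clean. */
static void model_cpu_write(struct model_obj *obj)
{
	obj->in_cpu_read_domain = true;
	obj->cache_dirty = true;
	obj->cache_clean = false;
}

/* Leaving the CPU domain: dirty lines are flushed out to memory. */
static void model_move_out_of_cpu(struct model_obj *obj)
{
	if (obj->cache_dirty)
		model_clflush(obj);
	obj->in_cpu_read_domain = false;
}

/* Assumption: a GPU or GTT write changes the backing storage behind the
 * CPU's back, so the clean marker has to be dropped. */
static void model_gpu_write(struct model_obj *obj)
{
	obj->cache_clean = false;
}

/* Returns true if an invalidating clflush was needed before the CPU read. */
static bool model_set_to_cpu_domain(struct model_obj *obj)
{
	bool invalidated = false;

	if (!obj->in_cpu_read_domain) {
		/* Skip the invalidate only if the cache is known clean. */
		if (!obj->cache_clean) {
			model_clflush(obj);
			invalidated = true;
		}
		obj->in_cpu_read_domain = true;
	}

	/* Assumption: once the CPU may read the object, cache lines can be
	 * populated again, so the clean guarantee no longer holds. */
	obj->cache_clean = false;
	return invalidated;
}
```

In this model, a CPU write followed by a move out of the CPU domain costs one
clflush, and bringing the object straight back into the CPU domain skips the
invalidate. If a GPU write intervenes, the invalidate is performed as before.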

Please Cc: Ville Syrjälä <ville.syrjala at linux.intel.com> as I trust his
criticisms here.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

