[Intel-gfx] [PATCH] drm/i915: clflush on pin_to_display after pwrite to UC bo in LLC

Ville Syrjälä ville.syrjala at linux.intel.com
Tue Aug 11 08:10:29 PDT 2015


On Tue, Aug 11, 2015 at 05:56:28PM +0300, Ville Syrjälä wrote:
> On Tue, Aug 11, 2015 at 03:28:27PM +0100, Chris Wilson wrote:
> > On Tue, Aug 11, 2015 at 04:59:14PM +0300, ville.syrjala at linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > 
> > > Currently we don't clflush on pin_to_display if the bo is already
> > > UC/WT and is not in the CPU write domain. This causes problems with
> > > pwrite since pwrite doesn't change the write domain, and it avoids
> > > clflushing on UC/WT buffers on LLC platforms unless the buffer is
> > > currently being scanned out.
> > > 
> > > Fix the problem by marking the cache dirty and adjusting
> > > i915_gem_object_set_cache_level() to clflush when the cache is dirty
> > > even if the cache_level doesn't change.
> > > 
> > > My last attempt [1] at fixing this via write domain frobbing was shot
> > > down, but now with the cache_dirty flag we can do things in a nicer way.
> > > 
> > > [1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422
> > > Testcase: igt/kms_pwrite_crc
> > > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 10 +++++++---
> > >  1 file changed, 7 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 73293b4..73eff2e 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -1005,12 +1005,15 @@ out:
> > >  		if (!needs_clflush_after &&
> > >  		    obj->base.write_domain != I915_GEM_DOMAIN_CPU) {
> > >  			if (i915_gem_clflush_object(obj, obj->pin_display))
> > > -				i915_gem_chipset_flush(dev);
> > > +				needs_clflush_after = true;
> > >  		}
> > >  	}
> > >  
> > >  	if (needs_clflush_after)
> > >  		i915_gem_chipset_flush(dev);
> > > +	else if (obj->cache_level == I915_CACHE_NONE ||
> > > +		 obj->cache_level == I915_CACHE_WT)
> > > +		obj->cache_dirty = true;
> > >  
> > >  	intel_fb_obj_flush(obj, false, ORIGIN_CPU);
> > >  	return ret;
> > 
> > Ok, this seems reasonable.
> > 
> > > @@ -3639,10 +3642,10 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >  {
> > >  	struct drm_device *dev = obj->base.dev;
> > >  	struct i915_vma *vma, *next;
> > > -	int ret;
> > > +	int ret = 0;
> > >  
> > >  	if (obj->cache_level == cache_level)
> > > -		return 0;
> > > +		goto out;
> > >  
> > >  	if (i915_gem_obj_is_pinned(obj)) {
> > >  		DRM_DEBUG("can not change the cache level of pinned objects\n");
> > > @@ -3687,6 +3690,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > >  		vma->node.color = cache_level;
> > >  	obj->cache_level = cache_level;
> > >  
> > > +out:
> > >  	if (obj->cache_dirty &&
> > >  	    obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
> > >  	    cpu_write_needs_clflush(obj)) {
> > 
> > This we can do better (and I know I am guilty of the original sin). It
> > just feels a little too loose that we expect pin-to-display-plane should
> > always call set-cache-level (yes, it has to, but it feels like we are
> > putting pin-to-display-plane specifics into set-cache-level).
> > 
> > I think this chunk is much more understandable if we did:
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 5e7fcf77e57a..fc6bcc19cca1 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3656,13 +3656,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >                 vma->node.color = cache_level;
> >         obj->cache_level = cache_level;
> >  
> > -       if (obj->cache_dirty &&
> > -           obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
> > -           cpu_write_needs_clflush(obj)) {
> > -               if (i915_gem_clflush_object(obj, true))
> > -                       i915_gem_chipset_flush(obj->base.dev);
> > -       }
> > -
> >         return 0;
> >  }
> >  
> > @@ -3795,6 +3788,10 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >         WARN_ON(obj->pin_display > vma->pin_count);
> >  
> >         i915_gem_object_flush_cpu_write_domain(obj);
> > +       if (obj->cache_dirty) {
> > +               if (i915_gem_clflush_object(obj, true))
> > +                       i915_gem_chipset_flush(obj->base.dev);
> > +       }
> >  
> >         /* It should now be out of any other write domains, and we can update
> >          * the domain values for our changes.
> > 
> > Maybe even
> > 
> > /* Whilst the object was away from the scanout we may have been eliding the coherent
> >  * writes into the CPU cache. However, the moment it has to be read by the display
> >  * engine, those writes into the CPU cache become inaccessible and so we have to 
> >  * clflush them out to main memory. We track elided flushes with obj->cache_dirty
> >  * and hope they are rare.
> >  */
> > 
> > 
> > The other user of set-cache-level (set_caching_ioctl) should not care
> > about the clflush side-effect and the clflush elision should work just
> > fine instead.
> 
> Hmm. So what would happen on !LLC if we start with a cached bo, then pwrite it
> and afterwards make it uncached?

In fact that would still fail even with my patch, and wouldn't work with current
upstream code either. To fix that I'd need to drop the I915_CACHE_NONE/WT checks
from pwrite in my patch and always set cache_dirty=true when it didn't
clflush. Doing that would seem perfectly reasonable to me.

-- 
Ville Syrjälä
Intel OTC


More information about the Intel-gfx mailing list