[Intel-gfx] [passive aggressive RESEND 2/2] drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty

Chris Wilson chris at chris-wilson.co.uk
Fri Jun 16 10:04:34 UTC 2017


Quoting Tvrtko Ursulin (2017-06-15 15:41:08)
> 
> On 15/06/2017 13:38, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 2a9aed5640e2..20933a15be46 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -1110,7 +1110,7 @@ eb_move_to_gpu(struct i915_execbuffer *eb)
> >               if (vma->exec_entry->flags & EXEC_OBJECT_ASYNC)
> >                       continue;
> >   
> > -             if (obj->cache_dirty)
> > +             if (obj->cache_dirty & ~obj->cache_coherent)
> 
> What is the explanation for this change?

From above:

> > For ease of use (i.e. avoiding a few checks and function calls), store
> > the object's cache coherency next to the cache is dirty bit.

It's a single load for cache_dirty + cache_coherent, but we add a new
reg mov and shift, a couple of instructions to save a function call and
a fresh load.

> 
> >                       i915_gem_clflush_object(obj, 0);
> >   
> >               ret = i915_gem_request_await_object
> > diff --git a/drivers/gpu/drm/i915/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
> > index 0ca867a877b6..caf76af36aba 100644
> > --- a/drivers/gpu/drm/i915/selftests/huge_gem_object.c
> > +++ b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
> > @@ -129,7 +129,8 @@ huge_gem_object(struct drm_i915_private *i915,
> >       obj->base.read_domains = I915_GEM_DOMAIN_CPU;
> >       obj->base.write_domain = I915_GEM_DOMAIN_CPU;
> >       obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
> > -     obj->cache_dirty = !i915_gem_object_is_coherent(obj);
> > +     obj->cache_coherent = i915_gem_object_is_coherent(obj);
> > +     obj->cache_dirty = !obj->cache_coherent;
> >       obj->scratch = phys_size;
> >   
> >       return obj;
> > 
> 
> Option of converting i915_gem_object_is_coherent to just return 
> obj->cache_coherent for less churn? (And adding 
> i915_gem_object_set_coherent or something if enough call sites to justify?)

My starting point was doing the bitops in execbuf, so I haven't
considered that alternative. However,

static inline bool i915_gem_object_is_coherent(struct drm_i915_gem_object *obj)
{
	return obj->cache_coherent;
}

might give Joonas a fit before his vacation :)
-Chris


More information about the Intel-gfx mailing list