[Intel-gfx] [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution for Braswell

Chris Wilson chris at chris-wilson.co.uk
Thu Dec 6 21:11:57 UTC 2018


Quoting Tvrtko Ursulin (2018-12-06 13:12:35)
> 
> On 06/12/2018 08:44, Chris Wilson wrote:
> > Braswell is really picky about having our writes posted to memory before
> > we execute or else the GPU may see stale values. A wmb() is insufficient
> > as it only ensures the writes are visible to other cores; we need a full
> > mb() to ensure the writes are in memory and visible to the GPU.
> > 
> > The most frequent failure in flushing before execution is that we see
> > stale PTE values and execute the wrong pages.
> > 
> > References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > Cc: stable at vger.kernel.org
> > ---
> >   drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
> >   1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index de1e9dc6aec0..e6a86fa4502d 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -379,8 +379,13 @@ static u64 execlists_update_context(struct i915_request *rq)
> >        * may not be visible to the HW prior to the completion of the UC
> >        * register write and that we may begin execution from the context
> >        * before its image is complete leading to invalid PD chasing.
> > +      *
> > +      * Furthermore, Braswell, at least, wants a full mb to be sure that
> > +      * the writes are coherent in memory (visible to the GPU) prior to
> > +      * execution, and not just visible to other CPUs (as is the result of
> > +      * wmb).
> >        */
> > -     wmb();
> > +     mb();
> >       return ce->lrc_desc;
> >   }
> >   
> > 
> 
> Too low level for me to really know what happens under the hood, but at 
> least I know it can't break anything.

The alternative I'm considering is to use an mmio read instead. However,
the improvement in stability from switching to mb() here is already
enough to proceed, even without finding the ideal solution.
-Chris
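
For readers following along outside the kernel tree, the ordering change in
the patch can be sketched in user space. This is only an analogy under
stated assumptions: struct fake_context, sketch_wmb(), sketch_mb() and
submit() are hypothetical names, not i915 code, and C11 fences stand in for
the kernel's wmb()/mb(). C11 fences only order accesses between CPU threads
and say nothing about when a device such as the GPU sees the data, so this
cannot reproduce the Braswell failure; it only shows where the barrier sits
relative to the writes and the submission.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a context image plus a submission register. */
struct fake_context {
	uint32_t image[4];          /* plays the role of the context image */
	volatile uint32_t doorbell; /* plays the role of the submit register */
};

/* Assumption: a release fence as a rough analogue of wmb(). */
static inline void sketch_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Assumption: a seq_cst fence as a rough analogue of a full mb(). */
static inline void sketch_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Fill the image, apply the full barrier, then "submit". */
static uint32_t submit(struct fake_context *ce, uint32_t value)
{
	assert(ce != NULL);

	for (int i = 0; i < 4; i++)
		ce->image[i] = value + (uint32_t)i;

	/* The patch replaces the wmb()-style barrier with the full one here. */
	sketch_mb();

	ce->doorbell = 1; /* execution may begin after this point */
	return ce->image[3];
}
```

Swapping sketch_mb() for sketch_wmb() in submit() gives the pre-patch
ordering. The mmio-read alternative mentioned above would instead replace
the barrier with a read back from the device, which this sketch does not
attempt to model.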

