[Intel-gfx] [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution for Braswell
Chris Wilson
chris at chris-wilson.co.uk
Thu Dec 6 21:11:57 UTC 2018
Quoting Tvrtko Ursulin (2018-12-06 13:12:35)
>
> On 06/12/2018 08:44, Chris Wilson wrote:
> > Braswell is really picky about having our writes posted to memory before
> > we execute or else the GPU may see stale values. A wmb() is insufficient
> > as it only ensures the writes are visible to other cores; we need a full
> > mb() to ensure the writes are in memory and visible to the GPU.
> >
> > The most frequent failure in flushing before execution is that we see
> > stale PTE values and execute the wrong pages.
> >
> > References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > Cc: stable at vger.kernel.org
> > ---
> > drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index de1e9dc6aec0..e6a86fa4502d 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -379,8 +379,13 @@ static u64 execlists_update_context(struct i915_request *rq)
> > * may not be visible to the HW prior to the completion of the UC
> > * register write and that we may begin execution from the context
> > * before its image is complete leading to invalid PD chasing.
> > + *
> > + * Furthermore, Braswell, at least, wants a full mb to be sure that
> > + * the writes are coherent in memory (visible to the GPU) prior to
> > + * execution, and not just visible to other CPUs (as is the result of
> > + * wmb).
> > */
> > - wmb();
> > + mb();
> > return ce->lrc_desc;
> > }
> >
> >
>
> Too low level for me to really know what happens under the hood, but at
> least I know it can't break anything.
The alternative I'm considering is using an mmio read instead. However,
the improvement in stability from switching to mb() here is already
enough to proceed without necessarily finding the ideal solution.
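
For anyone following along without the tree to hand, the ordering at
stake can be sketched in plain userspace C. This is only an analogy under
stated assumptions: struct fake_context, fake_submit_port,
update_context() and submit() are hypothetical stand-ins rather than
i915 code, and a C11 fence orders CPU memory accesses only, so it cannot
show the GPU-visibility effect seen on Braswell.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a context; not the i915 structure. */
struct fake_context {
	uint32_t ring_tail;	/* plays the role of a context-image field */
	uint64_t lrc_desc;	/* descriptor handed to the "hardware" */
};

/* Hypothetical stand-in for the submission register. */
static volatile uint64_t fake_submit_port;

/* Update the "context image", then fence before handing out the descriptor. */
static uint64_t update_context(struct fake_context *ce, uint32_t tail)
{
	ce->ring_tail = tail;

	/*
	 * Full fence: earlier loads and stores are ordered against later
	 * ones. Assumed here as the rough userspace analogue of mb(); a
	 * weaker fence such as memory_order_release would be the rough
	 * analogue of a write-only barrier.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	return ce->lrc_desc;
}

/* Write the descriptor to the "port" only after the fenced update. */
static void submit(struct fake_context *ce, uint32_t tail)
{
	assert(ce);
	fake_submit_port = update_context(ce, tail);
}
```

Calling submit(&ce, tail) stores the new tail, issues the fence, and only
then writes the descriptor to the fake port, which is the same sequence
the patch enforces with mb() ahead of the real submission.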
-Chris