[Intel-gfx] [PATCH] drm/i915: Unconditionally flush writes before execbuffer

Chris Wilson chris at chris-wilson.co.uk
Thu May 21 06:13:01 PDT 2015


On Thu, May 21, 2015 at 03:07:54PM +0200, Daniel Vetter wrote:
> On Thu, May 21, 2015 at 02:00:34PM +0100, Chris Wilson wrote:
> > On Tue, May 19, 2015 at 03:41:48PM +0100, Chris Wilson wrote:
> > > On Mon, May 11, 2015 at 04:25:52PM +0100, Chris Wilson wrote:
> > > > On Mon, May 11, 2015 at 12:34:37PM +0200, Daniel Vetter wrote:
> > > > > On Mon, May 11, 2015 at 08:51:36AM +0100, Chris Wilson wrote:
> > > > > > With the advent of mmap(wc), we have a path to write directly into
> > > > > > active GPU buffers. When combined with async updates (i.e. avoiding the
> > > > > > explicit domain management along with the memory barriers and GPU
> > > > > > stalls) we start to see the GPU read the wrong values from memory - i.e.
> > > > > > we have insufficient memory barriers along the execbuffer path. Writes
> > > > > > through the GTT should have been naturally serialised with execution
> > > > > > through the GTT as well and so the impact only seems to be from the WC
> > > > > > paths.
> > > > > > 
> > > > > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > > > > Cc: Akash Goel <akash.goel at intel.com>
> > > > > > Cc: stable at vger.kernel.org
> > > > > 
> > > > > Do we have a nasty igt for this? Bugzilla?
> > > > 
> > > > I've added igt/gem_streaming_writes.
> > > > 
> > > > That wmb() is not enough for !llc. Since the wmb() made piglit happy it
> > > > is quite possible I haven't hit the same path exactly, but it's going to
> > > > take some investigation to see if igt/gem_streaming_writes can possibly
> > > > work on !llc.
> > > 
> > > Humbug.
> > > 
> > > Found the bug in gem_streaming_writes, even though I still think the
> > > wmb() is strictly required, it runs fine without (presumably I haven't
> > > managed to avoid all barriers in the execbuffer path yet). However, I
> > > think can improve the stress by inserting extra gpu load -- that should
> > > help make the CPU writes / GPU reads of the buffer concurrent?
> > 
> > Just a small update. I haven't found a way to reproduce this in igt yet,
> > but I can still observe the effect using vbo-map-unsync and the fix
> > there is the above patch to make the wmb() unconditional.
> > 
> > We need to put this into stable@ reasonably quickly (I suspect some of
> > the 4.0 mmap(wc) regressions are due to this as well).
> 
> What about
> 
> 	if (flush_domains & (GTT | CPU))
> 		wmb();
> 
> instead? That would imo explain things a lot better, since cpu wc is
> treated as if in the CPU domains. Hm, looking at the igt that's not quite
> the case, we still put it into the GTT domain for wc mmaps afaict.

No. flush_domains is 0. We are talking about async writes which means
that userspace is not telling the kernel about susbsequent writes into
the inactive portions of the bo, and trusting that the buffer is
coherent and the writes are flushed. Putting the wmb() in the kernel is
not the only solution, but the most convenient (and allows us to just
emit one wmb() - but given the large number of other potential barriers
in this path, I am surprised that is required. Empirical evidence to the
contrary!)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list