[Intel-gfx] [PATCH 21/32] drm/i915: Broadwell execlists needs exactly the same seqno w/a as legacy

Tue Jan 5 02:20:13 PST 2016

On Mon, Jan 04, 2016 at 01:34:46PM -0800, Jesse Barnes wrote:
> On 12/11/2015 03:33 AM, Chris Wilson wrote:
> > +	 * Note that this effectively stalls the read by the time
> > +	 * it takes to do a memory transaction, which more or less ensures
> > +	 * that the write from the GPU has sufficient time to invalidate
> > +	 * the CPU cacheline. Alternatively we could delay the interrupt from
> > +	 * the CS ring to give the write time to land, but that would incur
> > +	 * a delay after every batch i.e. much more frequent than a delay
> > +	 * when waiting for the interrupt (with the same net latency).
> >  	 */
> > +	struct drm_i915_private *dev_priv = ring->i915;
> > +	POSTING_READ_FW(RING_ACTHD(ring->mmio_base));
> > +
> >  	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
> 
> Funnily enough, the interrupt ought to provide the same behavior as the MMIO read, i.e. flush outstanding system memory writes ahead of it.  The fact that we need it *plus* a CPU cache flush definitely means we're still missing something...

It is purely a timing issue (aside from bxt-a requiring the w/a). A
udelay() works just as well, but the question is what value to wait for,
which is where the above empiricism kicks in. (Another example would be
adding 32 dword writes into the ring between the seqno and
MI_USER_INTERRUPT).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
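
The second alternative mentioned in the reply (padding the ring between
the seqno write and MI_USER_INTERRUPT) can be sketched in user-space C as
below. This is a toy model under stated assumptions, not the i915 code:
struct toy_ring, toy_emit() and toy_emit_request() are hypothetical names,
the command encodings are written from memory of i915_reg.h and should be
checked against the driver, and reading "32 dword writes" as 32 MI_NOOP
padding dwords is an interpretation rather than something the thread
spells out.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodings recalled from i915_reg.h; verify against the real header. */
#define MI_INSTR(opcode, flags) (((uint32_t)(opcode) << 23) | (flags))
#define MI_NOOP              MI_INSTR(0x00, 0)
#define MI_USER_INTERRUPT    MI_INSTR(0x02, 0)
#define MI_STORE_DWORD_INDEX MI_INSTR(0x21, 1)

/* Assumption: the "32 dword writes" are plain no-op padding. */
#define SEQNO_PAD_DWORDS 32
#define TOY_RING_DWORDS  64

/* Hypothetical stand-in for a command ring: a flat array plus a tail. */
struct toy_ring {
	uint32_t cmd[TOY_RING_DWORDS];
	size_t tail;
};

/* Append one dword; returns -1 if the toy ring is full. */
static int toy_emit(struct toy_ring *ring, uint32_t dw)
{
	if (ring->tail >= TOY_RING_DWORDS)
		return -1;
	ring->cmd[ring->tail++] = dw;
	return 0;
}

/*
 * Emit "store seqno, padding, interrupt". The padding gives the seqno
 * write extra time to land before the interrupt is raised, at the cost
 * of a delay after every request (the trade-off noted in the patch
 * comment).
 */
static int toy_emit_request(struct toy_ring *ring, uint32_t hws_offset,
			    uint32_t seqno)
{
	size_t needed = 3 + SEQNO_PAD_DWORDS + 1;
	int i;

	/* Reserve space up front so a request is never half-written. */
	if (ring->tail + needed > TOY_RING_DWORDS)
		return -1;

	toy_emit(ring, MI_STORE_DWORD_INDEX);
	toy_emit(ring, hws_offset);
	toy_emit(ring, seqno);

	for (i = 0; i < SEQNO_PAD_DWORDS; i++)
		toy_emit(ring, MI_NOOP);

	toy_emit(ring, MI_USER_INTERRUPT);
	return 0;
}
```

Each request here costs 36 dwords (3 for the store, 32 of padding, 1 for
the interrupt). As with udelay(), the padding length is an empirical
choice rather than a derived one.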