[Intel-gfx] [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'

Gupta, Sourab sourab.gupta at intel.com
Fri Mar 21 17:50:18 CET 2014


On Fri, 2014-03-21 at 14:58 +0000, Daniel Vetter wrote:
> On Fri, Mar 21, 2014 at 11:53:40AM +0000, Gupta, Sourab wrote:
> > On Wed, 2014-01-22 at 11:11 +0000, Chris Wilson wrote:
> > > On Wed, Jan 22, 2014 at 12:54:51PM +0200, Ville Syrjälä wrote:
> > > > On Wed, Jan 22, 2014 at 09:15:06AM +0530, akash.goel at intel.com wrote:
> > > > > From: Akash Goel <akash.goel at intel.com>
> > > > >
> > > > > Added a new rendering specific Workaround 'WaReadAfterWriteHazard'.
> > > > > In this WA, need to add 12 MI Store Dword commands to ensure proper
> > > > > flush of h/w pipeline.
> > > > >
> > > > > Signed-off-by: Akash Goel <akash.goel at intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
> > > > >  1 file changed, 25 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > index 133d273..e8ec536 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > @@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
> > > > >
> > > > >     trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
> > > > >
> > > > > +   if (IS_VALLEYVIEW(ring->dev)) {
> > > > > +           /*
> > > > > +            * WaReadAfterWriteHazard
> > > > > +            * Send a number of Store Data commands here to finish
> > > > > +            * flushing hardware pipeline.This is needed in the case
> > > > > +            * where the next workload tries reading from the same
> > > > > +            * surface that this batch writes to. Without these StoreDWs,
> > > > > +            * not all of the data will actually be flushd to the surface
> > > > > +            * by the time the next batch starts reading it, possibly
> > > > > +            * causing a small amount of corruption.
> > > > > +            */
> > > > > +           int i;
> > > > > +           ret = intel_ring_begin(ring, 4 * 12);
> > > >
> > > > BSpec says 8 is enough. Is Bspec incorrect.
> > > 
> > > No, these are just figures they plucked out of the air. Last I heard
> > > they were using 32...
> > > 
> > > > Also this workaround is also listed for everything SNB+.
> > > 
> > > And we already have a more effective workaround.
> > > -Chris
> > 
> > Can you please let us know whether this workaround patch is required or
> > not. If not, then how is this currently handled.
> 
> We emit XY_SETUP_BLT before certain blt operations to insert a
> sufficiently long stall. The underlying bug this works around is that the
> cache controller of the cpu falls over in certain very specific rwm
> cycles. The official w/a is a full pipeline flush when switching between
> reading and writing to a surface, which has a horribly perf impact on the
> blitter.
> -Daniel

Hi Daniel,
In the kernel code, we're not able to see any such operation during the
batchbuffer submission. Is the XY_SETUP_BLT emit done through the
userspace, in these specific usecases?
Does this mean that this patch is not admissible as such for this
underlying bug?

Regards,
Sourab


More information about the Intel-gfx mailing list