[Intel-gfx] [PATCH 1/3] drm/i915/gt: Tweak gen7 xcs flushing

Chris Wilson chris at chris-wilson.co.uk
Thu Feb 6 19:27:24 UTC 2020


Quoting Mika Kuoppala (2020-02-06 16:35:12)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > Don't immediately write the seqno into the breadcrumb slot, but wait
> > until we've attempted to flush the writes; that is we need to ensure the
> > memory is coherent prior to updating the breadcrumb so that any
> > observers who see the new seqno can proceed.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  .../gpu/drm/i915/gt/intel_ring_submission.c   | 24 ++++++++++++-------
> >  1 file changed, 16 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > index 9537d4912225..42168d7cf5b5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > @@ -446,31 +446,39 @@ static u32 *gen6_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
> >       return cs;
> >  }
> >  
> > -#define GEN7_XCS_WA 32
> > -static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
> > +#define GEN7_XCS_WA 8
> > +static u32 *
> > +__gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 addr, u32 *cs)
> >  {
> >       int i;
> >  
> > -     GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
> > -     GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
> > -
> >       *cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
> >               MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
> > -     *cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
> > +     *cs++ = addr | MI_FLUSH_DW_USE_GTT;
> >       *cs++ = rq->fence.seqno;
> >  
> >       for (i = 0; i < GEN7_XCS_WA; i++) {
> >               *cs++ = MI_STORE_DWORD_INDEX;
> > -             *cs++ = I915_GEM_HWS_SEQNO_ADDR;
> > +             *cs++ = addr;
> >               *cs++ = rq->fence.seqno;
> >       }
> >  
> > +     return cs;
> > +}
> > +
> > +static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
> > +{
> > +     GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
> > +     GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
> > +
> > +     cs = __gen7_xcs_emit_breadcrumb(rq,  I915_GEM_HWS_SEQNO_ADDR + 4, cs);
> 
> One fake for the above before the real thing?

That's what I did later. Doesn't seem to make much difference either
way, but confirmation bias says at least 1 fake is better.
-Chris


More information about the Intel-gfx mailing list