[Intel-gfx] [PATCH 5/9] drm/i915/execlists: Unify CSB access pointers

Chris Wilson chris at chris-wilson.co.uk
Thu Jun 28 11:05:36 UTC 2018


Quoting Tvrtko Ursulin (2018-06-28 11:58:26)
> On 28/06/2018 11:10, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-06-28 11:02:17)
> >>
> >> On 27/06/2018 22:07, Chris Wilson wrote:
> >>> +     GEM_TRACE("%s cs-irq head=%d, tail=%d\n", engine->name, head, tail);
> >>> +     if (unlikely(head == tail))
> >>> +             return;
> >>>    
> >>> -             head = readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
> >>> -             tail = GEN8_CSB_WRITE_PTR(head);
> >>> -             head = GEN8_CSB_READ_PTR(head);
> >>> -             execlists->csb_head = head;
> >>> -     } else {
> >>> -             const int write_idx =
> >>> -                     intel_hws_csb_write_index(i915) -
> >>> -                     I915_HWS_CSB_BUF0_INDEX;
> >>> +     rmb(); /* Hopefully paired with a wmb() in HW */
> >>>    
> >>> -             head = execlists->csb_head;
> >>> -             tail = READ_ONCE(buf[write_idx]);
> >>> -             rmb(); /* Hopefully paired with a wmb() in HW */
> >>> -     }
> >>> -     GEM_TRACE("%s cs-irq head=%d [%d%s], tail=%d [%d%s]\n",
> >>> -               engine->name,
> >>> -               head, GEN8_CSB_READ_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?",
> >>> -               tail, GEN8_CSB_WRITE_PTR(readl(i915->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)))), fw ? "" : "?");
> >>> -
> >>> -     while (head != tail) {
> >>> +     do {
> >>
> >> Why convert while to do-while? Early unlikely return above handles the
> >> void process_csb calls? Would the same effect happen if you put unlikely
> >> in the while (head != tail) condition and would be simpler?
> > 
> > The earlier return to circumvent the lfence.
> 
> Oh that one.. so why it is safe to compare head and tail before the rmb 
> since we need the rmb afterwards?

There's a direct data dependency in the control flow of using the
volatile read from HWSP, but later on there is no data dependencies
between the write pointer we've "already" used and the rest of the CSB[]
we want to pull from HWSP. So while it seems unlikely, without that
lfence those later reads could be performed before the test (but the
test itself can't be performed before the read!). And sadly we can't get
away with our older read_barrier(), which for a long time I thought was
sufficient to order the read of the write pointer before the reads of
CSB[].
-Chris


More information about the Intel-gfx mailing list