[Intel-gfx] [PATCH] drm/i915/perf: don't read head/tail pointers outside critical section

Tue Mar 31 01:40:20 UTC 2020

On Mon, 30 Mar 2020 09:38:23 -0700, Chris Wilson wrote:
>
> Quoting Dixit, Ashutosh (2020-03-30 16:55:32)
> > On Mon, 30 Mar 2020 03:09:20 -0700, Chris Wilson wrote:
> > >
> > > Quoting Lionel Landwerlin (2020-03-30 10:14:11)
> > > > Reading or writing those fields should only happen under
> > > > stream->oa_buffer.ptr_lock.
> > >
> > > Writing, yes. Reading as a pair, sure. There are other ways you can
> > > ensure that the tail/head are read as one, but fair enough.
> >
> > Sorry but I am trying to understand exactly what the purpose of
> > stream->oa_buffer.ptr_lock is? This is a classic ring buffer producer
> > consumer situation where producer updates tail and consumer updates
> > head. Since both are u32's can't those operations be done without requiring
> > a lock?
>
> > > >         spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
> > > >
> > > > -       return OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> > > > -                       stream->oa_buffer.head - gtt_offset) >= report_size;
> > > > +       return pollin;
> >
> > In what way is the original code incorrect? As I mentioned head is u32 and
> > can be read atomically without requiring a lock? We had deliberately moved
> > this code outside the lock so as to pick up the the latest value of head if
> > it had been updated in the consumer (read).
>
> It's the pair of reads here. What's the synchronisation between the read
> of tail/head with the update? There's no sync between the reads so
> order is not determined here.
>
> So we may see the head updated for an old tail, and so think we have
> plenty to report, when in fact there's none (or someother convolution).
>
> Normal ringbuffer is to sample the head/tail pointers, smp_rmb(), then
> consume the data between head/tail (with the write doing the smp_wmb()
> after updating the data and before moving the tail). [So the normal
> usage of barriers is around access to one of tail/head (the other is
> under your control) and the shared contents.]

Ok, thanks for explanantion Chris, I think I understand how barriers are
used with ring buffers but I am still not sure if the previous code was
incorrect. It is almost consumer side code running in the producer
thread. Anyway, let's just go with this patch for now:

Acked-by: Ashutosh Dixit <ashutosh.dixit at intel.com>