[Intel-gfx] [PATCH] drm/i915/perf: don't read head/tail pointers outside critical section

Dixit, Ashutosh ashutosh.dixit at intel.com
Mon Mar 30 15:55:32 UTC 2020


On Mon, 30 Mar 2020 03:09:20 -0700, Chris Wilson wrote:
>
> Quoting Lionel Landwerlin (2020-03-30 10:14:11)
> > Reading or writing those fields should only happen under
> > stream->oa_buffer.ptr_lock.
>
> Writing, yes. Reading as a pair, sure. There are other ways you can
> ensure that the tail/head are read as one, but fair enough.

Sorry but I am trying to understand exactly what the purpose of
stream->oa_buffer.ptr_lock is? This is a classic ring buffer producer
consumer situation where producer updates tail and consumer updates
head. Since both are u32's can't those operations be done without requiring
a lock?

>
> > Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
> > Fixes: d1df41eb72ef ("drm/i915/perf: rework aging tail workaround")
> > ---
> >  drivers/gpu/drm/i915/i915_perf.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> > index c74ebac50015..ec9421f02ebd 100644
> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -463,6 +463,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> >         u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
> >         int report_size = stream->oa_buffer.format_size;
> >         unsigned long flags;
> > +       bool pollin;
> >         u32 hw_tail;
> >         u64 now;
> >
> > @@ -532,10 +533,13 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
> >                 stream->oa_buffer.aging_timestamp = now;
> >         }
> >
> > +       pollin = OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> > +                         stream->oa_buffer.head - gtt_offset) >= report_size;
> > +
> > +
>
> Bonus \n
>
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> >         spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);
> >
> > -       return OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
> > -                       stream->oa_buffer.head - gtt_offset) >= report_size;
> > +       return pollin;

In what way is the original code incorrect? As I mentioned head is u32 and
can be read atomically without requiring a lock? We had deliberately moved
this code outside the lock so as to pick up the the latest value of head if
it had been updated in the consumer (read).


More information about the Intel-gfx mailing list