[Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers

Wed Apr 15 10:31:53 PDT 2015

On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:
> Hi Daniel,
> 
> On 04/15/2015 03:17 AM, Daniel Vetter wrote:
> > This was a bit too much cargo-culted, so let's make it solid:
> > - vblank->count doesn't need to be an atomic, writes are always done
> >   under the protection of dev->vblank_time_lock. Switch to an unsigned
> >   long instead and update comments. Note that atomic_read is just a
> >   normal read of a volatile variable, so no need to audit all the
> >   read-side access specifically.
> > 
> > - The barriers for the vblank counter seqlock weren't complete: The
> >   read-side was missing the first barrier between the counter read and
> >   the timestamp read, it only had a barrier between the ts and the
> >   counter read. We need both.
> > 
> > - Barriers weren't properly documented. Since barriers only work if
> >   you have them on both sides of the transaction it's prudent to
> >   reference where the other side is. To avoid duplicating the
> >   write-side comment 3 times extract a little store_vblank() helper.
> >   In that helper also assert that we do indeed hold
> >   dev->vblank_time_lock, since in some cases the lock is acquired a
> >   few functions up in the callchain.
> > 
> > Spotted while reviewing a patch from Chris Wilson to add a fastpath to
> > the vblank_wait ioctl.
> > 
> > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mario Kleiner <mario.kleiner.de at gmail.com>
> > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > Cc: Michel Dänzer <michel at daenzer.net>
> > Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> > ---
> >  drivers/gpu/drm/drm_irq.c | 92 ++++++++++++++++++++++++-----------------------
> >  include/drm/drmP.h        |  8 +++--
> >  2 files changed, 54 insertions(+), 46 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> > index c8a34476570a..23bfbc61a494 100644
> > --- a/drivers/gpu/drm/drm_irq.c
> > +++ b/drivers/gpu/drm/drm_irq.c
> > @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600);
> >  module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600);
> >  module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);
> >  
> > +static void store_vblank(struct drm_device *dev, int crtc,
> > +			 unsigned vblank_count_inc,
> > +			 struct timeval *t_vblank)
> > +{
> > +	struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
> > +	u32 tslot;
> > +
> > +	assert_spin_locked(&dev->vblank_time_lock);
> > +
> > +	if (t_vblank) {
> > +		tslot = vblank->count + vblank_count_inc;
> > +		vblanktimestamp(dev, crtc, tslot) = *t_vblank;
> > +	}
> > +
> > +	/*
> > +	 * vblank timestamp updates are protected on the write side with
> > +	 * vblank_time_lock, but on the read side done locklessly using a
> > +	 * sequence-lock on the vblank counter. Ensure correct ordering using
> > +	 * memory barriers. We need the barrier both before and also after the
> > +	 * counter update to synchronize with the next timestamp write.
> > +	 * The read-side barriers for this are in drm_vblank_count_and_time.
> > +	 */
> > +	smp_wmb();
> > +	vblank->count += vblank_count_inc;
> > +	smp_wmb();
> 
> The comment and the code are each self-contradictory.
> 
> If vblank->count writes are always protected by vblank_time_lock (something I
> did not verify but that the comment above asserts), then the trailing write
> barrier is not required (and the assertion that it is in the comment is incorrect).
> 
> A spin unlock operation is always a write barrier.

Hm yeah. Otoh to me that's bordering on "code too clever for my own good".
That the spinlock is held I can assure. That no one goes around and does
multiple vblank updates (because somehow that code raced with the hw
itself) I can't easily assure with a simple assert or something similar.
It's not the case right now, but that can change.

Also it's not contradictory here, since you'd need to audit all the
callers to be able to make the claim that the 2nd smp_wmb() is redundant.
I'll just add a comment about this.
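
To make the barrier pairing concrete, here is a minimal userspace sketch,
not the actual patch. It assumes C11 fences as stand-ins for smp_wmb() and
smp_rmb(), and the names vbl_sketch, RB_SIZE, store_vblank_sketch() and
read_count_and_time_sketch() are made up for illustration. The write-side
lock is not shown; the caller is assumed to hold it.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RB_SIZE 2 /* hypothetical ring size for this sketch */

struct ts_sketch {
	_Atomic int64_t sec;
	_Atomic int64_t usec;
};

struct vbl_sketch {
	_Atomic uint32_t count;
	struct ts_sketch time[RB_SIZE];
};

/* Write side: the caller is assumed to hold the (not shown) time lock. */
static void store_vblank_sketch(struct vbl_sketch *v, uint32_t inc,
				int64_t sec, int64_t usec)
{
	uint32_t cur = atomic_load_explicit(&v->count, memory_order_relaxed);
	uint32_t slot = (cur + inc) % RB_SIZE;

	/* Fill the slot that readers of the current count do not look at. */
	atomic_store_explicit(&v->time[slot].sec, sec, memory_order_relaxed);
	atomic_store_explicit(&v->time[slot].usec, usec, memory_order_relaxed);

	/* First barrier: timestamp is visible before the new count. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&v->count, cur + inc, memory_order_relaxed);
	/* Second barrier: new count is visible before any later timestamp write. */
	atomic_thread_fence(memory_order_release);
}

/* Read side: lockless, retries if the count changed underneath us. */
static uint32_t read_count_and_time_sketch(struct vbl_sketch *v,
					   int64_t *sec, int64_t *usec)
{
	uint32_t cur;

	do {
		cur = atomic_load_explicit(&v->count, memory_order_relaxed);
		/* Pairs with the writer's first barrier. */
		atomic_thread_fence(memory_order_acquire);
		*sec = atomic_load_explicit(&v->time[cur % RB_SIZE].sec,
					    memory_order_relaxed);
		*usec = atomic_load_explicit(&v->time[cur % RB_SIZE].usec,
					     memory_order_relaxed);
		/* Pairs with the writer's second barrier. */
		atomic_thread_fence(memory_order_acquire);
	} while (cur != atomic_load_explicit(&v->count, memory_order_relaxed));

	return cur;
}
```

With exactly one update per lock hold, the second write-side fence may be
redundant, as Peter argues. Keeping it means the helper does not depend on
that property of its callers.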
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch