[Intel-gfx] [RFC][PATCH] wake_up_var() memory ordering
Peter Zijlstra
peterz at infradead.org
Tue Jun 25 08:11:03 UTC 2019
(sorry for cross-posting to moderated lists btw, I've since
acquired a patch to get_maintainer.pl that will exclude them
in the future)
On Tue, Jun 25, 2019 at 08:51:01AM +0100, David Howells wrote:
> Peter Zijlstra <peterz at infradead.org> wrote:
>
> > I tried using wake_up_var() today and accidentally noticed that it
> > didn't imply an smp_mb() and specifically requires it through
> > wake_up_bit() / waitqueue_active().
>
> Thinking about it again, I'm not sure why you need to add the barrier when
> wake_up() (which this is a wrapper around) is required to impose a barrier at
> the front if there's anything to wake up (ie. the wait queue isn't empty).
>
> If this is insufficient, does it make sense just to have wake_up*() functions
> do an unconditional release or full barrier right at the front, rather than it
> being conditional on something being woken up?
The culprit is __wake_up_bit()'s usage of waitqueue_active(); it is this
latter (see its comment) that requires the smp_mb().
wake_up_bit() and wake_up_var() are wrappers around __wake_up_bit().
Without this barrier it is possible for the waitqueue_active() load to
be hoisted over the cond=true store and the remote end can miss the
store and we can miss its enqueue and we'll all miss a wakeup and get
stuck.
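
The race above can be sketched in userspace C11. This is a rough analogue under stated assumptions, not kernel code: atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(), a simple counter stands in for the wait queue, and the names cond, nr_waiters, waker_publish_and_check() and waiter_enqueue_and_must_sleep() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel API. */
static atomic_bool cond;        /* the condition the waiter sleeps on */
static atomic_int  nr_waiters;  /* models "the wait queue is non-empty" */

/*
 * Waker side: publish cond, then check for waiters.
 * The full fence models the smp_mb() that waitqueue_active() requires;
 * it keeps the nr_waiters load from being hoisted over the cond store.
 * Returns true if a wakeup has to be issued.
 */
static bool waker_publish_and_check(void)
{
	atomic_store_explicit(&cond, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&nr_waiters, memory_order_relaxed) > 0;
}

/*
 * Waiter side: enqueue, then re-check cond.
 * The full fence pairs with the one in the waker.
 * Returns true if the waiter still has to go to sleep.
 */
static bool waiter_enqueue_and_must_sleep(void)
{
	atomic_fetch_add_explicit(&nr_waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&cond, memory_order_relaxed);
}
```

With both fences in place, at least one side observes the other's store: either the waker sees the waiter and issues the wakeup, or the waiter sees cond and does not sleep. Without the waker-side fence both loads may return stale values, which is the missed wakeup.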
Adding an smp_mb() (or use wq_has_sleeper()) in __wake_up_bit() would be
nice, but I fear some people will complain about overhead, esp. since
about half the sites don't need the barrier due to being behind
test_and_clear_bit() and the other half using smp_mb__after_atomic()
after some clear_bit*() variant.
There's a few sites that seem to open-code
wait_var_event()/wake_up_var() and those actually need the full
smp_mb(), but then maybe they should be converted to var instead of bit
anyway.
> > @@ -619,9 +614,7 @@ static int dvb_usb_fe_sleep(struct dvb_frontend *fe)
> > err:
> > if (!adap->suspend_resume_active) {
> > adap->active_fe = -1;
>
> I'm wondering if there's a missing barrier here. Should the clear_bit() on
> the next line be clear_bit_unlock() or clear_bit_release()?
That looks reasonable, but I'd like to hear from the DVB folks on that.
> > - clear_bit(ADAP_SLEEP, &adap->state_bits);
> > - smp_mb__after_atomic();
> > - wake_up_bit(&adap->state_bits, ADAP_SLEEP);
> > + clear_and_wake_up_bit(ADAP_SLEEP, &adap->state_bits);
> > }
> >
> > dev_dbg(&d->udev->dev, "%s: ret=%d\n", __func__, ret);
> > diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c
> > index cfe62b154f68..377ee07d5f76 100644
> > --- a/fs/afs/fs_probe.c
> > +++ b/fs/afs/fs_probe.c
> > @@ -18,6 +18,7 @@ static bool afs_fs_probe_done(struct afs_server *server)
> >
> > wake_up_var(&server->probe_outstanding);
> > clear_bit_unlock(AFS_SERVER_FL_PROBING, &server->flags);
> > + smp_mb__after_atomic();
> > wake_up_bit(&server->flags, AFS_SERVER_FL_PROBING);
> > return true;
> > }
>
> Looking at this and the dvb one, does it make sense to stick the release
> semantics of clear_bit_unlock() into clear_and_wake_up_bit()?
I was thinking of adding another helper, maybe unlock_and_wake_up_bit()
that included that extra barrier, but maybe making it unconditional
isn't the worst idea.
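
A minimal userspace sketch of what such a helper would have to order, assuming C11 atomics as stand-ins for the kernel primitives: the release-ordered atomic_fetch_and_explicit() models clear_bit_unlock(), the seq_cst fence models smp_mb__after_atomic(), and the waiter counter models the waitqueue_active() check. The names flags_word, bit_waiters and unlock_and_check_waiters() are hypothetical and this is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel API. */
static atomic_ulong flags_word;   /* the word holding the lock bit */
static atomic_int   bit_waiters;  /* models a non-empty bit wait queue */

/*
 * Clear the bit with release ordering, so that earlier writes are
 * visible to whoever observes the bit as clear, then issue a full
 * fence before looking for waiters.
 * Returns true if a wakeup has to be issued.
 */
static bool unlock_and_check_waiters(unsigned int bit)
{
	atomic_fetch_and_explicit(&flags_word, ~(1UL << bit),
				  memory_order_release);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&bit_waiters, memory_order_relaxed) > 0;
}
```

The release ordering and the full fence serve different purposes: the former orders prior accesses before the clear, the latter orders the clear before the waiter check.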
> Also, should clear_bit_unlock() be renamed to clear_bit_release() (and
> similarly test_and_set_bit_lock() -> test_and_set_bit_acquire()) if we seem to
> be trying to standardise on that terminology.
That definitely makes sense to me, there's only 157 clear_bit_unlock()
and 76 test_and_set_bit_lock() users (note the asymmetry of that).