[PATCH v2] locking: Fix __clear_task_blocked_on() warning from __ww_mutex_wound() path
K Prateek Nayak
kprateek.nayak at amd.com
Mon Aug 4 04:42:50 UTC 2025
Hello John,
On 8/2/2025 12:51 AM, John Stultz wrote:
> The __clear_task_blocked_on() helper added a number of sanity
> checks ensuring we hold the mutex wait lock and that the task
> we are clearing blocked_on pointer (if set) matches the mutex.
>
> However, there is an edge case in the _ww_mutex_wound() logic
> where we need to clear the blocked_on pointer for the task that
> owns the mutex, not the task that is waiting on the mutex.
>
> For this case the sanity checks aren't valid, so handle this
> by allowing a NULL lock to skip the additional checks.
>
> This was easier to miss, I realized, as the test-ww_mutex
> driver only exercises the wait-die class of ww_mutexes.
>
> I've got a follow up patch to extend the test so that it
> will exercise both.
>
> Fixes: a4f0b6fef4b0 ("locking/mutex: Add p->blocked_on wrappers for correctness checks")
> Reported-by: syzbot+602c4720aed62576cd79 at syzkaller.appspotmail.com
> Reported-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> Closes: https://lore.kernel.org/lkml/68894443.a00a0220.26d0e1.0015.GAE@google.com/
> Signed-off-by: John Stultz <jstultz at google.com>
I've been running this for a while and haven't seen any splats with
syzbot's C reproducer.
> ---
> v2:
> * Rewording of "lock" to "mutex" in commit and comment for
> clarity
> * Rework __clear_task_blocked_on() to use READ_ONCE and WRITE_ONCE
> so we don't trip over the WARNING if two instances race, as suggested
> by K Prateek Nayak and Maarten Lankhorst
>
> Cc: Ingo Molnar <mingo at redhat.com>
> Cc: Peter Zijlstra <peterz at infradead.org>
> Cc: Juri Lelli <juri.lelli at redhat.com>
> Cc: Vincent Guittot <vincent.guittot at linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann at arm.com>
> Cc: Valentin Schneider <valentin.schneider at arm.com>
> Cc: K Prateek Nayak <kprateek.nayak at amd.com>
> Cc: Suleiman Souhlal <suleiman at google.com>
> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> Cc: airlied at gmail.com
> Cc: mripard at kernel.org
> Cc: simona at ffwll.ch
> Cc: tzimmermann at suse.de
> Cc: dri-devel at lists.freedesktop.org
> Cc: kernel-team at android.com
> ---
> include/linux/sched.h | 23 +++++++++++++----------
> kernel/locking/ww_mutex.h | 6 +++++-
> 2 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 40d2fa90df425..700b50d29f7fe 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2166,16 +2166,19 @@ static inline void set_task_blocked_on(struct task_struct *p, struct mutex *m)
Should we consider using WRITE_ONCE() in __set_task_blocked_on() and
use a local copy of "blocked_on" there too?
I think a set_task_blocked_on() on a separate ww_mutex can still race
with a wound on the ww_ctx which indiscriminately writes NULL to
"owner->blocked_on" and can possibly lead to a splat for:
WARN_ON_ONCE(p->blocked_on && p->blocked_on != m);
^^^^^^^^^^^^^ ^^^^^^^^^^^^^
Sees p is blocked on "m" Turns NULL as a result
already. of a concurrent wound.
A READ_ONCE() and WRITE_ONCE() in __set_task_blocked_on() should help
solve the splat in this very unlikely case too unless I'm mistaken.
Apart from that, this fix looks good. Feel free to include:
Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak at amd.com>
--
Thanks and Regards,
Prateek
>
> static inline void __clear_task_blocked_on(struct task_struct *p, struct mutex *m)
> {
> - WARN_ON_ONCE(!m);
> - /* Currently we serialize blocked_on under the mutex::wait_lock */
> - lockdep_assert_held_once(&m->wait_lock);
> - /*
> - * There may be cases where we re-clear already cleared
> - * blocked_on relationships, but make sure we are not
> - * clearing the relationship with a different lock.
> - */
> - WARN_ON_ONCE(m && p->blocked_on && p->blocked_on != m);
> - p->blocked_on = NULL;
> + if (m) {
> + struct mutex *blocked_on = READ_ONCE(p->blocked_on);
> +
> + /* Currently we serialize blocked_on under the mutex::wait_lock */
> + lockdep_assert_held_once(&m->wait_lock);
> + /*
> + * There may be cases where we re-clear already cleared
> + * blocked_on relationships, but make sure we are not
> + * clearing the relationship with a different lock.
> + */
> + WARN_ON_ONCE(blocked_on && blocked_on != m);
> + }
> + WRITE_ONCE(p->blocked_on, NULL);
> }
More information about the dri-devel
mailing list