[RFC][PATCH] locking: Fix __clear_task_blocked_on() warning from __ww_mutex_wound() path

K Prateek Nayak kprateek.nayak at amd.com
Fri Aug 1 05:09:08 UTC 2025


Hello John,

On 8/1/2025 1:43 AM, John Stultz wrote:

[..snip..]

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 40d2fa90df425..a9a78f51f7f57 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2166,15 +2166,16 @@ static inline void set_task_blocked_on(struct task_struct *p, struct mutex *m)
>  
>  static inline void __clear_task_blocked_on(struct task_struct *p, struct mutex *m)
>  {
> -	WARN_ON_ONCE(!m);
> -	/* Currently we serialize blocked_on under the mutex::wait_lock */
> -	lockdep_assert_held_once(&m->wait_lock);
> -	/*
> -	 * There may be cases where we re-clear already cleared
> -	 * blocked_on relationships, but make sure we are not
> -	 * clearing the relationship with a different lock.
> -	 */
> -	WARN_ON_ONCE(m && p->blocked_on && p->blocked_on != m);
> +	if (m) {
> +		/* Currently we serialize blocked_on under the mutex::wait_lock */
> +		lockdep_assert_held_once(&m->wait_lock);
> +		/*
> +		 * There may be cases where we re-clear already cleared
> +		 * blocked_on relationships, but make sure we are not
> +		 * clearing the relationship with a different lock.
> +		 */
> +		WARN_ON_ONCE(m && p->blocked_on && p->blocked_on != m);

One small concern: as Hillf pointed out, we don't hold
"owner->blocked_on->wait_lock" here when arriving from
__ww_mutex_wound(). Can we then run into a situation like:

              CPU0                                                               CPU1
        (Owner of Mutex A,                                              (Trying to acquire Mutex A)
    trying to acquire Mutex B)
    ==========================                                          ===========================

    __mutex_lock_common(B)
      ... /* B->wait_lock held */
      set_task_blocked_on(ownerA, B)
      if (__mutex_trylock(B)) /* Returns true */                        __mutex_lock_common(A)
        goto acquired; /* Goes to below point */                          ... /* A->wait_lock held */
      __clear_task_blocked_on(ownerA, B);                                 __ww_mutex_wound(ownerA)
        WARN_ON_ONCE(m /* Mutex B*/                                         ...
                     && ownerA->blocked_on /* Mutex B */                    __clear_task_blocked_on(ownerA, NULL)
                     ...                                                      ownerA->blocked_on = NULL;
                     && ownerA->blocked_on /* NULL */ != m /* Mutex B */);
          !!! SPLAT !!!


At the very least, I think we should make a local copy of "p->blocked_on"
so that __clear_task_blocked_on() sees a consistent view throughout: the
task either sees that it is blocked on the mutex and clears
"p->blocked_on", or it sees that it is blocked on nothing and still clears
"p->blocked_on".

(Tested lightly with syzbot's C reproducer)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 02c340450469..f35d93cca64f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2165,6 +2165,8 @@ static inline void set_task_blocked_on(struct task_struct *p, struct mutex *m)
 static inline void __clear_task_blocked_on(struct task_struct *p, struct mutex *m)
 {
 	if (m) {
+		struct mutex *blocked_on = p->blocked_on;
+
 		/* Currently we serialize blocked_on under the mutex::wait_lock */
 		lockdep_assert_held_once(&m->wait_lock);
 		/*
@@ -2172,7 +2174,7 @@ static inline void __clear_task_blocked_on(struct task_struct *p, struct mutex *
 		 * blocked_on relationships, but make sure we are not
 		 * clearing the relationship with a different lock.
 		 */
-		WARN_ON_ONCE(m && p->blocked_on && p->blocked_on != m);
+		WARN_ON_ONCE(m && blocked_on && blocked_on != m);
 	}
 	p->blocked_on = NULL;
 }
---

The end result is the same; we only avoid an unnecessary splat in this
very unlikely case and save ourselves some head scratching later :)
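
For illustration only, here is a minimal userspace sketch of why a
single read is enough. It is not kernel code: "struct task",
remote_clear() and the two would_warn_*() helpers are hypothetical
stand-ins, and the racing clear from the other CPU is forced at a fixed
point instead of running concurrently. In the kernel itself, a
READ_ONCE() on the load may also be worth considering so the compiler
cannot re-read the field; that is an assumption and not part of the
diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the field we need. */
struct mutex { int id; };
struct task { struct mutex *blocked_on; };

/* Hypothetical hook: plays CPU1's __clear_task_blocked_on(owner, NULL). */
static void remote_clear(struct task *p)
{
	p->blocked_on = NULL;
}

/*
 * Old check: p->blocked_on is read twice. The remote clear is forced
 * between the two reads. Returns true if WARN_ON_ONCE() would fire.
 */
static bool would_warn_double_read(struct task *p, struct mutex *m)
{
	bool nonnull, differs;

	assert(p);
	nonnull = p->blocked_on != NULL;	/* first read: sees m */
	remote_clear(p);			/* CPU1 runs here */
	differs = p->blocked_on != m;		/* second read: sees NULL */
	return m && nonnull && differs;
}

/*
 * Proposed check: one read into a local. The remote clear can land
 * before or after that read; neither order trips the warning.
 */
static bool would_warn_snapshot(struct task *p, struct mutex *m, bool clear_first)
{
	struct mutex *blocked_on;

	assert(p);
	if (clear_first)
		remote_clear(p);
	blocked_on = p->blocked_on;		/* single read */
	if (!clear_first)
		remote_clear(p);
	return m && blocked_on && blocked_on != m;
}
```

With a task whose blocked_on points at mutex B, and B passed as "m",
would_warn_double_read() returns true, while would_warn_snapshot()
returns false for both orderings of the remote clear.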

Thoughts?

> +	}
>  	p->blocked_on = NULL;
>  }
>  
> diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
> index 086fd5487ca77..ef8ef3c28592c 100644
> --- a/kernel/locking/ww_mutex.h
> +++ b/kernel/locking/ww_mutex.h
> @@ -342,8 +342,12 @@ static bool __ww_mutex_wound(struct MUTEX *lock,
>  			 * When waking up the task to wound, be sure to clear the
>  			 * blocked_on pointer. Otherwise we can see circular
>  			 * blocked_on relationships that can't resolve.
> +			 *
> +			 * NOTE: We pass NULL here instead of lock, because we
> +			 * are waking the lock owner, who may be currently blocked
> +			 * on a different lock.
>  			 */
> -			__clear_task_blocked_on(owner, lock);
> +			__clear_task_blocked_on(owner, NULL);
>  			wake_q_add(wake_q, owner);
>  		}
>  		return true;

-- 
Thanks and Regards,
Prateek


