[PATCH] drm/xe: Fix taking invalid lock on wedge

Vivekanandan, Balasubramani balasubramani.vivekanandan at intel.com
Mon Apr 7 10:34:08 UTC 2025


On 02.04.2025 22:38, Lucas De Marchi wrote:
> If device wedges on e.g. GuC upload, the submission is not yet enabled
> and the state is not even initialized. Protect the wedge call so it does
> nothing in this case. It fixes the following splat:
> 
> 	[] xe 0000:bf:00.0: [drm] device wedged, needs recovery
> 	[] ------------[ cut here ]------------
> 	[] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
> 	[] WARNING: CPU: 48 PID: 312 at kernel/locking/mutex.c:564 __mutex_lock+0x8a1/0xe60
> 	...
> 	[] RIP: 0010:__mutex_lock+0x8a1/0xe60
> 	[]  mutex_lock_nested+0x1b/0x30
> 	[]  xe_guc_submit_wedge+0x80/0x2b0 [xe]
> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_guc_submit.c | 9 +++++++++
>  drivers/gpu/drm/xe/xe_guc_types.h  | 5 +++++
>  2 files changed, 14 insertions(+)

Looks good to me.

Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan at intel.com>

Regards,
Bala

> 
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 31bc2022bfc2d..813c3c0bb2500 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -300,6 +300,8 @@ int xe_guc_submit_init(struct xe_guc *guc, unsigned int num_ids)
>  
>  	primelockdep(guc);
>  
> +	guc->submission_state.initialized = true;
> +
>  	return drmm_add_action_or_reset(&xe->drm, guc_submit_fini, guc);
>  }
>  
> @@ -834,6 +836,13 @@ void xe_guc_submit_wedge(struct xe_guc *guc)
>  
>  	xe_gt_assert(guc_to_gt(guc), guc_to_xe(guc)->wedged.mode);
>  
> +	/*
> +	 * If device is being wedged even before submission_state is
> +	 * initialized, there's nothing to do here.
> +	 */
> +	if (!guc->submission_state.initialized)
> +		return;
> +
>  	err = devm_add_action_or_reset(guc_to_xe(guc)->drm.dev,
>  				       guc_submit_wedged_fini, guc);
>  	if (err) {
> diff --git a/drivers/gpu/drm/xe/xe_guc_types.h b/drivers/gpu/drm/xe/xe_guc_types.h
> index 63bac64429a5d..1fde7614fcc52 100644
> --- a/drivers/gpu/drm/xe/xe_guc_types.h
> +++ b/drivers/gpu/drm/xe/xe_guc_types.h
> @@ -89,6 +89,11 @@ struct xe_guc {
>  		struct mutex lock;
>  		/** @submission_state.enabled: submission is enabled */
>  		bool enabled;
> +		/**
> +		 * @submission_state.initialized: mark when submission state is
> +		 * even initialized - before that not even the lock is valid
> +		 */
> +		bool initialized;
>  		/** @submission_state.fini_wq: submit fini wait queue */
>  		wait_queue_head_t fini_wq;
>  	} submission_state;
> 
> 
> 
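
For readers following the splat: the warning fires because the debug mutex check compares lock->magic against the lock's own address, and that only holds after the mutex has been initialized. Below is a minimal userspace sketch of the guard pattern the patch adds. It is not the xe code: toy_mutex, toy_submission_state and every function name here are hypothetical stand-ins, and the "magic" check is only a toy model of the kernel's debug-mutex check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock (hypothetical): records its own address at init time. */
struct toy_mutex {
	const struct toy_mutex *magic;
	bool held;
};

void toy_mutex_init(struct toy_mutex *m)
{
	m->magic = m;
	m->held = false;
}

/*
 * Returns false in the situation where the kernel would emit
 * DEBUG_LOCKS_WARN_ON(lock->magic != lock).
 */
bool toy_mutex_lock(struct toy_mutex *m)
{
	if (m->magic != m)
		return false;
	m->held = true;
	return true;
}

void toy_mutex_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

/* Hypothetical stand-in for guc->submission_state. */
struct toy_submission_state {
	struct toy_mutex lock;
	bool initialized;	/* set only once the lock is valid */
	int wedge_count;	/* placeholder for work done under the lock */
};

/* Plays the role of the init path: set up the lock, then mark it usable. */
void toy_submit_init(struct toy_submission_state *s)
{
	toy_mutex_init(&s->lock);
	s->wedge_count = 0;
	s->initialized = true;
}

/*
 * Plays the role of the wedge path.
 * Returns 0 if skipped (state not initialized), 1 if the wedge work ran,
 * -1 if the lock turned out to be invalid.
 */
int toy_submit_wedge(struct toy_submission_state *s)
{
	if (!s->initialized)
		return 0;	/* nothing to do yet, and the lock is not valid */

	if (!toy_mutex_lock(&s->lock))
		return -1;
	s->wedge_count++;
	toy_mutex_unlock(&s->lock);
	return 1;
}
```

With a zero-initialized state, toy_submit_wedge() returns 0 without touching the lock, whereas calling toy_mutex_lock() directly on that state fails the magic check. After toy_submit_init(), the wedge path takes the lock normally.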