[PATCH] drm/xe: Fix taking invalid lock on wedge
Vivekanandan, Balasubramani
balasubramani.vivekanandan at intel.com
Mon Apr 7 10:34:08 UTC 2025
On 02.04.2025 22:38, Lucas De Marchi wrote:
> If device wedges on e.g. GuC upload, the submission is not yet enabled
> and the state is not even initialized. Protect the wedge call so it does
> nothing in this case. It fixes the following splat:
>
> [] xe 0000:bf:00.0: [drm] device wedged, needs recovery
> [] ------------[ cut here ]------------
> [] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
> [] WARNING: CPU: 48 PID: 312 at kernel/locking/mutex.c:564 __mutex_lock+0x8a1/0xe60
> ...
> [] RIP: 0010:__mutex_lock+0x8a1/0xe60
> [] mutex_lock_nested+0x1b/0x30
> [] xe_guc_submit_wedge+0x80/0x2b0 [xe]
>
> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_submit.c | 9 +++++++++
> drivers/gpu/drm/xe/xe_guc_types.h | 5 +++++
> 2 files changed, 14 insertions(+)
Looks good to me.
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan at intel.com>
Regards,
Bala
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 31bc2022bfc2d..813c3c0bb2500 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -300,6 +300,8 @@ int xe_guc_submit_init(struct xe_guc *guc, unsigned int num_ids)
>
>  	primelockdep(guc);
>
> +	guc->submission_state.initialized = true;
> +
>  	return drmm_add_action_or_reset(&xe->drm, guc_submit_fini, guc);
>  }
>
> @@ -834,6 +836,13 @@ void xe_guc_submit_wedge(struct xe_guc *guc)
>
>  	xe_gt_assert(guc_to_gt(guc), guc_to_xe(guc)->wedged.mode);
>
> +	/*
> +	 * If device is being wedged even before submission_state is
> +	 * initialized, there's nothing to do here.
> +	 */
> +	if (!guc->submission_state.initialized)
> +		return;
> +
>  	err = devm_add_action_or_reset(guc_to_xe(guc)->drm.dev,
>  				       guc_submit_wedged_fini, guc);
>  	if (err) {
> diff --git a/drivers/gpu/drm/xe/xe_guc_types.h b/drivers/gpu/drm/xe/xe_guc_types.h
> index 63bac64429a5d..1fde7614fcc52 100644
> --- a/drivers/gpu/drm/xe/xe_guc_types.h
> +++ b/drivers/gpu/drm/xe/xe_guc_types.h
> @@ -89,6 +89,11 @@ struct xe_guc {
>  		struct mutex lock;
>  		/** @submission_state.enabled: submission is enabled */
>  		bool enabled;
> +		/**
> +		 * @submission_state.initialized: mark when submission state is
> +		 * even initialized - before that not even the lock is valid
> +		 */
> +		bool initialized;
>  		/** @submission_state.fini_wq: submit fini wait queue */
>  		wait_queue_head_t fini_wq;
>  	} submission_state;
>
>
>
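For readers following the thread, the guard in the patch can be illustrated outside the kernel with a minimal user-space sketch. It is not the xe code: struct submission_state, submission_state_init(), submission_state_wedge() and the wedged_queues counter are hypothetical stand-ins, a pthread mutex replaces the kernel's struct mutex, and the structure is assumed to start out zeroed so that the flag reads false before init runs.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for guc->submission_state; not the xe driver's type. */
struct submission_state {
	pthread_mutex_t lock;
	bool initialized;	/* set only once the lock is valid */
	int wedged_queues;	/* illustrative data protected by the lock */
};

/* Plays the role of xe_guc_submit_init(): set up the lock, then set the flag. */
static int submission_state_init(struct submission_state *s)
{
	int err = pthread_mutex_init(&s->lock, NULL);

	if (err)
		return err;

	s->wedged_queues = 0;
	s->initialized = true;
	return 0;
}

/*
 * Plays the role of xe_guc_submit_wedge(): bail out before touching the
 * lock if init never ran. Returns true if any work was done.
 */
static bool submission_state_wedge(struct submission_state *s)
{
	if (!s->initialized)
		return false;

	pthread_mutex_lock(&s->lock);
	s->wedged_queues++;
	pthread_mutex_unlock(&s->lock);
	return true;
}
```

Calling submission_state_wedge() on a zeroed structure returns without taking the lock, which corresponds to the early return the patch adds. After submission_state_init() succeeds, the same call takes the lock as usual.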