[PATCH V3] drm/xe: cancel pending job timer before freeing scheduler
Matthew Brost
matthew.brost at intel.com
Tue Feb 25 05:33:17 UTC 2025
On Tue, Feb 25, 2025 at 10:27:54AM +0530, Tejas Upadhyay wrote:
> Async call to __guc_exec_queue_fini_async frees scheduler
> at the same time when some scheduler submission would have
> timed out and restarted. To handle such small window race
> case, all pending jobs timer should be cancelled before
> freeing scheduler.
>
Suggested rewording of the commit message:

'The async call to __guc_exec_queue_fini_async frees the scheduler while
a submission may time out and restart. To prevent this race condition,
the pending job timer should be canceled before freeing the scheduler.'

> V3(MattB):
> - Adjust position of cancel pending job
> - Remove gitlab issue# from commit message
> V2(MattB):
> - Cancel pending jobs before scheduler finish
>
Can you add a Fixes: tag?
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
With the above addressed:
Reviewed-by: Matthew Brost <matthew.brost at intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 913c74d6e2ae..b6a2dd742ebd 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -1248,6 +1248,8 @@ static void __guc_exec_queue_fini_async(struct work_struct *w)
>
>  	if (xe_exec_queue_is_lr(q))
>  		cancel_work_sync(&ge->lr_tdr);
> +	/* Confirm no work left behind accessing device structures */
> +	cancel_delayed_work_sync(&ge->sched.base.work_tdr);
>  	release_guc_id(guc, q);
>  	xe_sched_entity_fini(&ge->entity);
>  	xe_sched_fini(&ge->sched);
> --
> 2.34.1
>
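
For anyone following along, the ordering this patch enforces can be modelled in userspace. The sketch below is only an analogy under stated assumptions, not the driver code: a pthread stands in for the delayed timeout work (work_tdr), and struct demo_sched, demo_tdr_fn(), demo_cancel_tdr_sync() and demo_teardown() are hypothetical names invented for illustration. The point is that the cancel-and-wait step has to complete before the structure the handler touches is freed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for a scheduler with a pending timeout handler. */
struct demo_sched {
	pthread_t tdr_thread;	/* plays the role of the delayed timeout work */
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool cancelled;
	int timeouts_handled;	/* state the timeout handler touches */
};

/* Timeout handler: fires after 5 seconds unless cancelled first. */
static void *demo_tdr_fn(void *arg)
{
	struct demo_sched *s = arg;
	struct timespec deadline;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 5;

	pthread_mutex_lock(&s->lock);
	while (!s->cancelled) {
		if (pthread_cond_timedwait(&s->cond, &s->lock,
					   &deadline) == ETIMEDOUT) {
			s->timeouts_handled++;	/* accesses scheduler state */
			break;
		}
	}
	pthread_mutex_unlock(&s->lock);
	return NULL;
}

/* Rough analogue of a synchronous cancel: cancel, then wait for the handler. */
static void demo_cancel_tdr_sync(struct demo_sched *s)
{
	pthread_mutex_lock(&s->lock);
	s->cancelled = true;
	pthread_cond_signal(&s->cond);
	pthread_mutex_unlock(&s->lock);
	pthread_join(s->tdr_thread, NULL);
}

/* Returns the number of timeouts handled before teardown, or -1 on error. */
int demo_teardown(void)
{
	struct demo_sched *s = calloc(1, sizeof(*s));
	int handled;

	if (!s)
		return -1;
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->cond, NULL);
	if (pthread_create(&s->tdr_thread, NULL, demo_tdr_fn, s)) {
		pthread_cond_destroy(&s->cond);
		pthread_mutex_destroy(&s->lock);
		free(s);
		return -1;
	}

	/* The handler must be stopped before the memory it uses goes away. */
	demo_cancel_tdr_sync(s);
	handled = s->timeouts_handled;

	pthread_cond_destroy(&s->cond);
	pthread_mutex_destroy(&s->lock);
	free(s);
	return handled;
}
```

Calling demo_teardown() cancels the handler well inside its 5 second window, so the handler never gets to touch the structure. Swapping the order, so that free(s) runs before demo_cancel_tdr_sync(s), would leave the thread dereferencing freed memory, which is the window the patch closes for work_tdr.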