[Intel-xe] [RFC PATCH 06/10] drm/sched: Submit job before starting TDR
Luben Tuikov
luben.tuikov at amd.com
Thu May 4 05:23:05 UTC 2023
On 2023-04-03 20:22, Matthew Brost wrote:
> If the TDR is set to a value, it can fire before a job is submitted in
> drm_sched_main. The job should always be submitted before the TDR
> fires, fix this ordering.
>
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 6ae710017024..4eac02d212c1 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1150,10 +1150,10 @@ static void drm_sched_main(struct work_struct *w)
> s_fence = sched_job->s_fence;
>
> atomic_inc(&sched->hw_rq_count);
> - drm_sched_job_begin(sched_job);
>
> trace_drm_run_job(sched_job, entity);
> fence = sched->ops->run_job(sched_job);
> + drm_sched_job_begin(sched_job);
> complete_all(&entity->entity_idle);
> drm_sched_fence_scheduled(s_fence);
>
I'm not sure this is correct. In drm_sched_job_begin() we add the job to the
"pending_list" (meaning it is pending execution in the hardware) and we also
start a timeout timer. Both of those should happen before the job is given to
the hardware.

If the timeout is set to too small a value, then that should probably be fixed
instead.
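
To illustrate the ordering concern, here is a minimal user-space sketch. It is
a toy model under stated assumptions, not the scheduler's actual code: every
toy_* name is hypothetical, the "hardware" is assumed to finish the job before
run_job returns, and the completion handler is assumed to look the job up in
the pending list, which simplifies what the real scheduler does.

```c
/* Toy user-space model of the two orderings. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

struct toy_job {
	int id;
	bool done;
};

struct toy_sched {
	struct toy_job *pending[MAX_PENDING];	/* stands in for pending_list */
	size_t npending;
	bool timer_armed;			/* stands in for the timeout timer */
	int lost_completions;			/* completions that found no tracked job */
};

/* Models drm_sched_job_begin(): track the job and arm the timeout. */
static void toy_job_begin(struct toy_sched *s, struct toy_job *j)
{
	assert(s->npending < MAX_PENDING);
	s->pending[s->npending++] = j;
	s->timer_armed = true;
}

/* Assumed completion path: find the job in the pending list and retire it. */
static void toy_job_complete(struct toy_sched *s, struct toy_job *j)
{
	for (size_t i = 0; i < s->npending; i++) {
		if (s->pending[i] == j) {
			s->pending[i] = s->pending[--s->npending];
			j->done = true;
			if (s->npending == 0)
				s->timer_armed = false;
			return;
		}
	}
	s->lost_completions++;	/* job finished before it was tracked */
}

/* Models run_job() on hardware that completes before run_job returns. */
static void toy_run_job(struct toy_sched *s, struct toy_job *j)
{
	toy_job_complete(s, j);
}

/* Existing ordering: begin (track + arm timer), then run. */
static void submit_begin_first(struct toy_sched *s, struct toy_job *j)
{
	toy_job_begin(s, j);
	toy_run_job(s, j);
}

/* Ordering proposed by the patch: run, then begin. */
static void submit_run_first(struct toy_sched *s, struct toy_job *j)
{
	toy_run_job(s, j);
	toy_job_begin(s, j);
}
```

In this model, submit_begin_first() retires the job and disarms the timer,
while submit_run_first() leaves a finished job on the pending list with the
timer still armed. Whether the real scheduler can hit an equivalent window
depends on its actual completion path, which this sketch does not reproduce.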
Regards,
Luben