[Intel-xe] [PATCH v4 07/10] drm/sched: Start submission before TDR in drm_sched_start

Luben Tuikov luben.tuikov at amd.com
Fri Sep 29 21:53:10 UTC 2023


Hi,

On 2023-09-19 01:01, Matthew Brost wrote:
> If the TDR is set to a very small value it can fire before the
> submission is started in the function drm_sched_start. The submission is
> expected to running when the TDR fires, fix this ordering so this
> expectation is always met.
> 
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 09ef07b9e9d5..a5cc9b6c2faa 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -684,10 +684,10 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>  			drm_sched_job_done(s_job, -ECANCELED);
>  	}
>  
> +	drm_sched_submit_start(sched);
> +
>  	if (full_recovery)
>  		drm_sched_start_timeout_unlocked(sched);
> -
> -	drm_sched_submit_start(sched);
>  }
>  EXPORT_SYMBOL(drm_sched_start);

No.

A timeout timer should be started before we submit anything down to the hardware.
See Message-ID: <ed3aca10-8a9f-4698-92f4-21558fa6cfe3 at amd.com>,
and Message-ID: <8e5eab14-9e55-42c9-b6ea-02fcc591266d at amd.com>.

You shouldn't start TDR at an arbitrarily late time after job
submission to the hardware. To close this, the timer is started
before jobs are submitted to the hardware.

One possibility is to increase the timeout timer value.
-- 
Regards,
Luben



More information about the Intel-xe mailing list