[PATCH] drm/sched: fix the duplicated TMO message for one IB

Christian König ckoenig.leichtzumerken at gmail.com
Thu May 9 10:29:56 UTC 2019


drm_sched_start() is not necessarily called from the timeout handler.

If a soft recovery is sufficient, we just continue without a complete reset.
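
To make the point concrete, here is a minimal user-space sketch of the argument, not the scheduler code itself. Every model_* name is hypothetical and invented for illustration. It assumes the timer is only re-armed while jobs are still pending, and that the driver callback gets by with a soft recovery, so no full reset path (and therefore no drm_sched_start()) runs to re-arm it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these are not the kernel's types. */
struct model_job {
    struct model_job *next;     /* next hung job in submission order */
};

struct model_sched {
    struct model_job *pending;  /* stand-in for ring_mirror_list */
    bool timer_armed;           /* stand-in for the delayed work_tdr */
    bool rearm_in_handler;      /* false models the proposed patch */
    int reports;                /* timeout reports emitted so far */
};

/*
 * Stand-in for a driver timedout_job() callback where soft recovery is
 * enough: it reports and drops the bad job, but performs no full reset,
 * so nothing on this path re-arms the timer.
 */
static void model_soft_recover(struct model_sched *s)
{
    assert(s->pending != NULL);
    s->reports++;
    s->pending = s->pending->next;
}

/* Stand-in for drm_sched_job_timedout(). */
static void model_job_timedout(struct model_sched *s)
{
    s->timer_armed = false;     /* the delayed work has just fired */

    if (s->pending)
        model_soft_recover(s);

    /* Assumption: re-arm only while jobs are still pending. */
    if (s->rearm_in_handler && s->pending)
        s->timer_armed = true;
}

/* Let the timer fire for as long as it is armed; return the report count. */
static int model_run(struct model_sched *s)
{
    s->timer_armed = (s->pending != NULL);
    while (s->timer_armed)
        model_job_timedout(s);
    return s->reports;
}

/* Two hung jobs in a row, with or without the re-arm in the handler. */
static int model_two_hung_jobs(bool rearm)
{
    struct model_job second = { .next = NULL };
    struct model_job first = { .next = &second };
    struct model_sched s = {
        .pending = &first,
        .rearm_in_handler = rearm,
    };

    return model_run(&s);
}
```

In this model, model_two_hung_jobs(true) reports both hung jobs, while model_two_hung_jobs(false) reports only the first one and the second hang goes unnoticed.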

Christian.

On 09.05.19 at 12:25, Liu, Monk wrote:
> Christian
>
> Check drm_sched_start(), which is invoked from gpu_recover(): there is a drm_sched_start_timeout() call at its tail.
>
> /Monk
>
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken at gmail.com>
> Sent: Thursday, May 9, 2019 3:18 PM
> To: Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/sched: fix the duplicated TMO message for one IB
>
> On 09.05.19 at 06:31, Monk Liu wrote:
>> We don't need the timeout error message for the same IB reported
>> endlessly; one report per timed-out IB is enough.
> Well, NAK. We don't need multiple timeout reports, but we really need to restart the timeout counter after handling it.
>
> Otherwise we will never run into a timeout again after handling one, and it isn't unlikely that multiple IBs in a row don't work correctly.
>
> Christian.
>
>> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
>> ---
>>    drivers/gpu/drm/scheduler/sched_main.c | 5 -----
>>    1 file changed, 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index c1aaf85..d6c17f1 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -308,7 +308,6 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>    {
>>        struct drm_gpu_scheduler *sched;
>>        struct drm_sched_job *job;
>> -     unsigned long flags;
>>
>>        sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>        job = list_first_entry_or_null(&sched->ring_mirror_list,
>> @@ -316,10 +315,6 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>
>>        if (job)
>>                job->sched->ops->timedout_job(job);
>> -
>> -     spin_lock_irqsave(&sched->job_list_lock, flags);
>> -     drm_sched_start_timeout(sched);
>> -     spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>    }
>>
>>     /**