Yes, I think so as well. Andrey, can you push this?

Christian.

On 15.09.21 at 00:59, Grodzovsky, Andrey wrote:
AFAIK this one is independent.

Christian, can you confirm?

Andrey
________________________________
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Alex Deucher <alexdeucher@gmail.com>
Sent: 14 September 2021 15:33
To: Christian König <ckoenig.leichtzumerken@gmail.com>
Cc: Liu, Monk <Monk.Liu@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>; Mailing list - DRI developers <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

Was this fix independent of the other discussions? Should this be
applied to drm-misc?

Alex

On Wed, Sep 1, 2021 at 4:42 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Wed, Sep 1, 2021 at 2:50 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
> >
> > On 01.09.21 at 02:46, Monk Liu wrote:
> > > issue:
> > > in cleanup_job the cancel_delayed_work will cancel a TO timer
> > > even if its corresponding job is still running.
> > >
> > > fix:
> > > do not cancel the timer in cleanup_job; instead do the cancelling
> > > only when the head job is signaled, and if there is a "next" job,
> > > start_timeout again.
> > >
> > > v2:
> > > further clean up the logic, and do the TDR timer cancelling if the
> > > signaled job is the last one in its scheduler.
> > >
> > > v3:
> > > change the issue description
> > > remove the cancel_delayed_work at the beginning of cleanup_job
> > > restore the implementation of drm_sched_job_begin.
> > >
> > > v4:
> > > remove the kthread_should_park() check in the cleanup_job routine;
> > > we should clean up the signaled job asap
> > >
> > > TODO:
> > > 1) introduce pause/resume scheduler in job_timeout to serialize the
> > > handling of scheduler and job_timeout.
> > > 2) drop the bad job's removal and re-insertion in the scheduler,
> > > made possible by the above serialization (no race issue anymore)
> > >
> > > Tested-by: jingwen <jingwen.chen@amd.com>
> > > Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> >
> > Reviewed-by: Christian König <christian.koenig@amd.com>
> >
>
> Are you planning to push this to drm-misc?
>
> Alex
>
>
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_main.c | 26 +++++++++-----------------
> > >  1 file changed, 9 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index a2a9536..3e0bbc7 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > >  {
> > >          struct drm_sched_job *job, *next;
> > >
> > > -        /*
> > > -         * Don't destroy jobs while the timeout worker is running OR thread
> > > -         * is being parked and hence assumed to not touch pending_list
> > > -         */
> > > -        if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > > -            !cancel_delayed_work(&sched->work_tdr)) ||
> > > -            kthread_should_park())
> > > -                return NULL;
> > > -
> > >          spin_lock(&sched->job_list_lock);
> > >
> > >          job = list_first_entry_or_null(&sched->pending_list,
> > > @@ -693,17 +684,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > >          if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> > >                  /* remove job from pending_list */
> > >                  list_del_init(&job->list);
> > > +
> > > +                /* cancel this job's TO timer */
> > > +                cancel_delayed_work(&sched->work_tdr);
> > >                  /* make the scheduled timestamp more accurate */
> > >                  next = list_first_entry_or_null(&sched->pending_list,
> > >                                                  typeof(*next), list);
> > > -                if (next)
> > > +
> > > +                if (next) {
> > >                          next->s_fence->scheduled.timestamp =
> > >                                  job->s_fence->finished.timestamp;
> > > -
> > > +                        /* start TO timer for next job */
> > > +                        drm_sched_start_timeout(sched);
> > > +                }
> > >          } else {
> > >                  job = NULL;
> > > -                /* queue timeout for next job */
> > > -                drm_sched_start_timeout(sched);
> > >          }
> > >
> > >          spin_unlock(&sched->job_list_lock);
> > > @@ -791,11 +786,8 @@ static int drm_sched_main(void *param)
> > >                           (entity = drm_sched_select_entity(sched))) ||
> > >                           kthread_should_stop());
> > >
> > > -                if (cleanup_job) {
> > > +                if (cleanup_job)
> > >                          sched->ops->free_job(cleanup_job);
> > > -                        /* queue timeout for next job */
> > > -                        drm_sched_start_timeout(sched);
> > > -                }
> > >
> > >                  if (!entity)
> > >                          continue;
> >
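
For illustration, the timeout stretching that the patch fixes can be
reproduced with a small standalone simulation. This is only a sketch of
the old cleanup path's effect on the TDR timer; the names (TOY_TIMEOUT,
the wakeup loop standing in for the scheduler thread) are illustrative
stand-ins, not real drm_sched code.

    #include <stdio.h>

    #define TOY_TIMEOUT 10 /* seconds, stand-in for sched->timeout */

    int main(void)
    {
            /* The head job hangs at t = 0; the TDR timer is armed to
             * fire at t = TOY_TIMEOUT. */
            int tdr_fires_at = TOY_TIMEOUT;

            /* Jobs elsewhere finish every 9 seconds and wake the
             * scheduler thread. The old cleanup path did
             * cancel_delayed_work() followed by
             * drm_sched_start_timeout() on every wakeup, re-arming
             * the timer from "now" even though the head job was
             * still running. */
            for (int now = 9; now < 40; now += 9) {
                    tdr_fires_at = now + TOY_TIMEOUT;
                    printf("t=%2d: cleanup ran, head job still busy, "
                           "TDR re-armed for t=%d\n", now, tdr_fires_at);
            }

            printf("hang detected at t=%d instead of t=%d\n",
                   tdr_fires_at, TOY_TIMEOUT);
            return 0;
    }

With the patch, the timer is only cancelled once the head job has
actually signaled, and re-armed for the next job, so a hung head job
times out after the configured interval.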