Yes, I think so as well. Andrey, can you push this?

Christian.

On 15.09.21 at 00:59, Grodzovsky, Andrey wrote:
AFAIK this one is independent.

Christian, can you confirm?

Andrey
________________________________
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Alex Deucher <alexdeucher@gmail.com>
Sent: 14 September 2021 15:33
To: Christian König <ckoenig.leichtzumerken@gmail.com>
Cc: Liu, Monk <Monk.Liu@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>; Mailing list - DRI developers <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

Was this fix independent of the other discussions? Should this be
applied to drm-misc?

Alex

On Wed, Sep 1, 2021 at 4:42 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Wed, Sep 1, 2021 at 2:50 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
> >
> > On 01.09.21 at 02:46, Monk Liu wrote:
> > > issue:
> > > in cleanup_job the cancel_delayed_work will cancel a TO timer
> > > even if its corresponding job is still running.
> > >
> > > fix:
> > > do not cancel the timer in cleanup_job; instead do the cancelling
> > > only when the head job is signaled, and if there is a "next" job,
> > > start_timeout again.
> > >
> > > v2:
> > > further clean up the logic, and do the TDR timer cancelling if the
> > > signaled job is the last one in its scheduler.
> > >
> > > v3:
> > > change the issue description
> > > remove the cancel_delayed_work at the beginning of cleanup_job
> > > restore the implementation of drm_sched_job_begin.
> > >
> > > v4:
> > > remove the kthread_should_park() check in the cleanup_job routine;
> > > we should clean up the signaled job asap
> > >
> > > TODO:
> > > 1) introduce pause/resume scheduler in job_timeout to serialize the
> > > handling of scheduler and job_timeout.
> > > 2) drop the bad job's removal and re-insertion in the scheduler,
> > > made possible by the above serialization (no race issue anymore)
> > >
> > > Tested-by: jingwen <jingwen.chen@amd.com>
> > > Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> >
> > Reviewed-by: Christian König <christian.koenig@amd.com>
> >
>
> Are you planning to push this to drm-misc?
>
> Alex
>
>
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_main.c | 26 +++++++++-----------------
> > >  1 file changed, 9 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index a2a9536..3e0bbc7 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > >  {
> > >          struct drm_sched_job *job, *next;
> > >
> > > -        /*
> > > -         * Don't destroy jobs while the timeout worker is running OR thread
> > > -         * is being parked and hence assumed to not touch pending_list
> > > -         */
> > > -        if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > > -            !cancel_delayed_work(&sched->work_tdr)) ||
> > > -            kthread_should_park())
> > > -                return NULL;
> > > -
> > >          spin_lock(&sched->job_list_lock);
> > >
> > >          job = list_first_entry_or_null(&sched->pending_list,
> > > @@ -693,17 +684,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > >          if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> > >                  /* remove job from pending_list */
> > >                  list_del_init(&job->list);
> > > +
> > > +                /* cancel this job's TO timer */
> > > +                cancel_delayed_work(&sched->work_tdr);
> > >                  /* make the scheduled timestamp more accurate */
> > >                  next = list_first_entry_or_null(&sched->pending_list,
> > >                                                  typeof(*next), list);
> > > -                if (next)
> > > +
> > > +                if (next) {
> > >                          next->s_fence->scheduled.timestamp =
> > >                                  job->s_fence->finished.timestamp;
> > > -
> > > +                        /* start TO timer for next job */
> > > +                        drm_sched_start_timeout(sched);
> > > +                }
> > >          } else {
> > >                  job = NULL;
> > > -                /* queue timeout for next job */
> > > -                drm_sched_start_timeout(sched);
> > >          }
> > >
> > >          spin_unlock(&sched->job_list_lock);
> > > @@ -791,11 +786,8 @@ static int drm_sched_main(void *param)
> > >                           (entity = drm_sched_select_entity(sched))) ||
> > >                           kthread_should_stop());
> > >
> > > -                if (cleanup_job) {
> > > +                if (cleanup_job)
> > >                          sched->ops->free_job(cleanup_job);
> > > -                        /* queue timeout for next job */
> > > -                        drm_sched_start_timeout(sched);
> > > -                }
> > >
> > >                  if (!entity)
> > >                          continue;
> >
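
For illustration, the timeout stretching that the patch fixes can be
reproduced with a small standalone simulation. This is only a sketch of
the old cleanup path's effect on the TDR timer; the names (TOY_TIMEOUT,
the wakeup loop standing in for the scheduler thread) are illustrative
stand-ins, not real drm_sched code.

    #include <stdio.h>

    #define TOY_TIMEOUT 10 /* seconds, stand-in for sched->timeout */

    int main(void)
    {
            /* The head job hangs at t = 0; the TDR timer is armed to
             * fire at t = TOY_TIMEOUT. */
            int tdr_fires_at = TOY_TIMEOUT;

            /* Jobs elsewhere finish every 9 seconds and wake the
             * scheduler thread. The old cleanup path did
             * cancel_delayed_work() followed by
             * drm_sched_start_timeout() on every wakeup, re-arming
             * the timer from "now" even though the head job was
             * still running. */
            for (int now = 9; now < 40; now += 9) {
                    tdr_fires_at = now + TOY_TIMEOUT;
                    printf("t=%2d: cleanup ran, head job still busy, "
                           "TDR re-armed for t=%d\n", now, tdr_fires_at);
            }

            printf("hang detected at t=%d instead of t=%d\n",
                   tdr_fires_at, TOY_TIMEOUT);
            return 0;
    }

With the patch, the timer is only cancelled once the head job has
actually signaled, and re-armed for the next job, so a hung head job
times out after the configured interval.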