Turns out Steven's patch was already in, so I just cherry-picked the change from drm-misc-next.

Emily - it's in.
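
For anyone skimming the thread: the change quoted at the bottom boils down to "take the bad job off the mirror list under the lock, and hand it back only after the scheduler thread is parked". A minimal sketch of that pattern (hypothetical toy_* names, not the actual drm_sched code):

/* Sketch only: hypothetical names, not the real drm_sched code. */
#include <linux/kthread.h>
#include <linux/list.h>
#include <linux/spinlock.h>

struct toy_sched {
        spinlock_t lock;
        struct list_head pending;       /* stand-in for ring_mirror_list */
        struct task_struct *thread;
};

struct toy_job {
        struct list_head node;
        struct toy_sched *sched;
};

/* Timeout side: pull the first job off the list under the lock so the
 * cleanup path (running in the scheduler thread) cannot free it while
 * we are still looking at it. */
static struct toy_job *toy_timeout_grab(struct toy_sched *s)
{
        struct toy_job *job;
        unsigned long flags;

        spin_lock_irqsave(&s->lock, flags);
        job = list_first_entry_or_null(&s->pending, struct toy_job, node);
        if (job)
                list_del_init(&job->node);
        spin_unlock_irqrestore(&s->lock, flags);

        return job;     /* now safe to dereference */
}

/* Stop side: once the scheduler thread is parked nothing can race with
 * us anymore, so the bad job goes back at the head of the list. */
static void toy_stop(struct toy_sched *s, struct toy_job *bad)
{
        kthread_park(s->thread);

        if (bad && bad->sched == s)
                list_add(&bad->node, &s->pending);
}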

Andrey

    <div class="moz-cite-prefix">On 12/3/19 2:59 PM, Deucher, Alexander
      wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:MWHPR12MB135874597AF738D51965F925F7420@MWHPR12MB1358.namprd12.prod.outlook.com">
      
      <style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
      <p style="font-family:Arial;font-size:10pt;color:#0078D7;margin:15pt;" align="Left">
        [AMD Official Use Only - Internal Distribution Only]<br>
      </p>
      <br>
Cherry-pick whatever dependencies you need, or pick the older version of the patch. Either way works.

Alex

        <hr style="display:inline-block;width:98%" tabindex="-1">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b>
            Grodzovsky, Andrey <a class="moz-txt-link-rfc2396E" href="mailto:Andrey.Grodzovsky@amd.com"><Andrey.Grodzovsky@amd.com></a><br>
            <b>Sent:</b> Tuesday, December 3, 2019 2:57 PM<br>
            <b>To:</b> Deucher, Alexander
            <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>; Deng, Emily
            <a class="moz-txt-link-rfc2396E" href="mailto:Emily.Deng@amd.com"><Emily.Deng@amd.com></a><br>
            <b>Cc:</b> <a class="moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org">dri-devel@lists.freedesktop.org</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:dri-devel@lists.freedesktop.org"><dri-devel@lists.freedesktop.org></a>;
            <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a>; Koenig, Christian
            <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a>; <a class="moz-txt-link-abbreviated" href="mailto:steven.price@arm.com">steven.price@arm.com</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:steven.price@arm.com"><steven.price@arm.com></a><br>
            <b>Subject:</b> Re: [PATCH v4] drm/scheduler: Avoid
            accessing freed bad job.</font>
          <div> </div>
        </div>
        <div style="background-color:#FFFFFF">
          <p>I don't think i can apply this patch 'as is' as this has
            dependency on patch by Steven which also wasn't applied yet
            - 588b982 Steven Price        6 weeks ago    drm: Don't free
            jobs in wait_event_interruptible()</p>
          <p><br>
          </p>
          <p>Andrey<br>
          </p>
          <p><br>
          </p>
          <div class="x_moz-cite-prefix">On 12/3/19 2:44 PM, Deucher,
            Alexander wrote:<br>
          </div>
          <blockquote type="cite">
            <style type="text/css" style="display:none">
<!--
p
        {margin-top:0;
        margin-bottom:0}
-->
</style>
            <p style="font-family:Arial; font-size:10pt; color:#0078D7;
              margin:15pt" align="Left">
              [AMD Official Use Only - Internal Distribution Only]<br>
            </p>
            <br>
Please go ahead and apply whatever version is necessary for amd-staging-drm-next.

Alex

              <hr tabindex="-1" style="display:inline-block; width:98%">
              <div id="x_divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Grodzovsky, Andrey
                  <a class="x_moz-txt-link-rfc2396E" href="mailto:Andrey.Grodzovsky@amd.com" moz-do-not-send="true"><Andrey.Grodzovsky@amd.com></a><br>
                  <b>Sent:</b> Tuesday, December 3, 2019 2:10 PM<br>
                  <b>To:</b> Deng, Emily <a class="x_moz-txt-link-rfc2396E" href="mailto:Emily.Deng@amd.com" moz-do-not-send="true">
                    <Emily.Deng@amd.com></a>; Deucher, Alexander <a class="x_moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">
                    <Alexander.Deucher@amd.com></a><br>
                  <b>Cc:</b> <a class="x_moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org" moz-do-not-send="true">
                    dri-devel@lists.freedesktop.org</a> <a class="x_moz-txt-link-rfc2396E" href="mailto:dri-devel@lists.freedesktop.org" moz-do-not-send="true">
                    <dri-devel@lists.freedesktop.org></a>; <a class="x_moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">
                    amd-gfx@lists.freedesktop.org</a> <a class="x_moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">
                    <amd-gfx@lists.freedesktop.org></a>; Koenig,
                  Christian <a class="x_moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true">
                    <Christian.Koenig@amd.com></a>; <a class="x_moz-txt-link-abbreviated" href="mailto:steven.price@arm.com" moz-do-not-send="true">
                    steven.price@arm.com</a> <a class="x_moz-txt-link-rfc2396E" href="mailto:steven.price@arm.com" moz-do-not-send="true">
                    <steven.price@arm.com></a><br>
                  <b>Subject:</b> Re: [PATCH v4] drm/scheduler: Avoid
                  accessing freed bad job.</font>
                <div> </div>
              </div>
              <div class="x_BodyFragment"><font size="2"><span style="font-size:11pt">
                    <div class="x_PlainText">Yes - Christian just pushed
                      it to drm-next-misc - I guess Alex/Christian
                      <br>
                      didn't pull to amd-staging-drm-next yet.<br>
                      <br>
                      Andrey<br>
                      <br>
                      On 12/2/19 2:24 PM, Deng, Emily wrote:<br>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily <Emily.Deng@amd.com>; steven.price@arm.com
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in the sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possibility that the
>>>> bad job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list, remove it
>>>> from the list under lock and then put it back later, when we are
>>>> guaranteed no race with the main sched thread is possible, which is
>>>> after the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to reflect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>>>>   1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>>          unsigned long flags;
>>>>
>>>>          sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>>> +
>>>> +        /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>>> +        spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>          job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                         struct drm_sched_job, node);
>>>>
>>>>          if (job) {
>>>> +                /*
>>>> +                 * Remove the bad job so it cannot be freed by concurrent
>>>> +                 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>>>> +                 * is parked at which point it's safe.
>>>> +                 */
>>>> +                list_del_init(&job->node);
>>>> +                spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>                  job->sched->ops->timedout_job(job);
>>>>
>>>>                  /*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>>                          job->sched->ops->free_job(job);
>>>>                          sched->free_guilty = false;
>>>>                  }
>>>> +        } else {
>>>> +                spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>          }
>>>>
>>>>          spin_lock_irqsave(&sched->job_list_lock, flags);
>>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>>          kthread_park(sched->thread);
>>>>
>>>>          /*
>>>> +         * Reinsert back the bad job here - now it's safe as
>>>> +         * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +         * bad job at this point - we parked (waited for) any in progress
>>>> +         * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
>>>> +         * now until the scheduler thread is unparked.
>>>> +         */
>>>> +        if (bad && bad->sched == sched)
>>>> +                /*
>>>> +                 * Add at the head of the queue to reflect it was the earliest
>>>> +                 * job extracted.
>>>> +                 */
>>>> +                list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +        /*
>>>>           * Iterate the job list from later to earlier one and either deactive
>>>>           * their HW callbacks or remove them from mirror list if they already
>>>>           * signaled.