<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Am 02.08.2018 um 16:11 schrieb Andrey
      Grodzovsky:<br>
    </div>
    <blockquote type="cite"
      cite="mid:5d3ec8da-db2e-0aa2-0a7c-a55f27b27a04@amd.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <p><br>
      </p>
      <br>
      <div class="moz-cite-prefix">On 08/02/2018 02:47 AM, Nayan
        Deshmukh wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:CAFd4ddwKJvfrTu5uAub1K517DhdcOZbZ8k=gyzPy6GVqx_PhtA@mail.gmail.com">
        <div dir="ltr"><br>
          <br>
          <div class="gmail_quote">
            <div dir="ltr">On Thu, Aug 2, 2018 at 12:12 PM Christian
              König <<a
                href="mailto:ckoenig.leichtzumerken@gmail.com"
                moz-do-not-send="true">ckoenig.leichtzumerken@gmail.com</a>>
              wrote:<br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">Am
              02.08.2018 um 00:25 schrieb Andrey Grodzovsky:<br>
              > Thinking again about this change and 53d5f1e
              drm/scheduler: move idle <br>
              > entities to scheduler with less load v2<br>
              ><br>
              > Looks to me like use case for which fc9a539
              drm/scheduler: add NULL <br>
              > pointer check for run queue (v2) was done<br>
              ><br>
              > will not work anymore.<br>
              ><br>
              > First of all in drm_sched_entity_push_job, 'if
              (!entity->rq)' will <br>
              > never be true any more since we stopped entity->rq
              to NULL in<br>
              ><br>
              > drm_sched_entity_flush.<br>
              <br>
              Good point, going to remove that.<br>
            </blockquote>
          </div>
        </div>
      </blockquote>
      <br>
      So basically we don't need that if (...){ return; } in
      drm_sched_entity_push_job any more ?<br>
    </blockquote>
    <br>
    Yes, exactly.<br>
    <br>
    <blockquote type="cite"
      cite="mid:5d3ec8da-db2e-0aa2-0a7c-a55f27b27a04@amd.com"> <br>
      <blockquote type="cite"
cite="mid:CAFd4ddwKJvfrTu5uAub1K517DhdcOZbZ8k=gyzPy6GVqx_PhtA@mail.gmail.com">
        <div dir="ltr">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br>
              ><br>
              > Second of all, even after we removed the entity from
              rq in <br>
              > drm_sched_entity_flush to terminate any subsequent
              submissions<br>
              ><br>
              > to the queue the other thread doing push job can just
              acquire again a <br>
              > run queue<br>
              ><br>
              > from drm_sched_entity_get_free_sched and continue
              submission<br>
              <br>
            </blockquote>
            <div>Hi Christian <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex"> That is
              actually desired.<br>
              <br>
              When another process is now using the entity to submit
              jobs adding it <br>
              back to the rq is actually the right thing to do cause the
              entity is <br>
              still in use.<br>
            </blockquote>
          </div>
        </div>
      </blockquote>
      <br>
      Yes, no problem if it's another process. But what about another
      thread from same process ? Is it a possible use case that 2
      threads from same process submit to same entity concurrently ? If
      so and we specifically kill one, the other will not stop event if
      we want him to because current code makes him just require a rq
      for him self.<br>
    </blockquote>
    <br>
    Well you can't kill a single thread of a process (you can only
    interrupt it), a SIGKILL will always kill the whole process.<br>
    <br>
    <blockquote type="cite"
      cite="mid:5d3ec8da-db2e-0aa2-0a7c-a55f27b27a04@amd.com"> <br>
      <blockquote type="cite"
cite="mid:CAFd4ddwKJvfrTu5uAub1K517DhdcOZbZ8k=gyzPy6GVqx_PhtA@mail.gmail.com">
        <div dir="ltr">
          <div class="gmail_quote">
            <div>I am not aware of the usecase so I might just be
              rambling. but if the thread/process that created the
              entity has called drm_sched_entity_flush then we shouldn't
              allow other processes to push jobs to that entity. <br>
              <br>
            </div>
            <div>Nayan  <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br>
              Christian.<br>
            </blockquote>
          </div>
        </div>
      </blockquote>
      We don't really know who created the entity, the creation could be
      by X itself and then it's passed to other process for use. Check
      'drm/scheduler: only kill entity if last user is killed v2', the
      idea is that if by the time you want to<br>
      kill this entity another process (not another thread from your
      process - i.e. current->group_leader != last_user in
      drm_sched_entity_flush) already started to use this entity just
      let it be.<br>
    </blockquote>
    <br>
    Yes, exactly that's the idea.<br>
    <br>
    Christian.<br>
    <br>
    <blockquote type="cite"
      cite="mid:5d3ec8da-db2e-0aa2-0a7c-a55f27b27a04@amd.com"> <br>
      Andrey<br>
      <blockquote type="cite"
cite="mid:CAFd4ddwKJvfrTu5uAub1K517DhdcOZbZ8k=gyzPy6GVqx_PhtA@mail.gmail.com">
        <div dir="ltr">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br>
              > so you can't substitute ' if (!entity->rq)' to 'if
              <br>
              > (list_empty(&entity->list))'.<br>
              ><br>
              > What about adding a 'stopped' flag to
              drm_sched_entity to be set in <br>
              > drm_sched_entity_flush and in<br>
              ><br>
              > drm_sched_entity_push_job check for  'if
              (!entity->stopped)' instead <br>
              > of  ' if (!entity->rq)' ?<br>
              ><br>
              > Andrey<br>
              ><br>
              ><br>
              > On 07/30/2018 07:03 AM, Christian König wrote:<br>
              >> We removed the redundancy of having an extra
              scheduler field, so we<br>
              >> can't set the rq to NULL any more or otherwise
              won't know which<br>
              >> scheduler to use for the cleanup.<br>
              >><br>
              >> Just remove the entity from the scheduling list
              instead.<br>
              >><br>
              >> Signed-off-by: Christian König <<a
                href="mailto:christian.koenig@amd.com" target="_blank"
                moz-do-not-send="true">christian.koenig@amd.com</a>><br>
              >> ---<br>
              >>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 35
              <br>
              >> +++++++------------------------<br>
              >>   1 file changed, 8 insertions(+), 27
              deletions(-)<br>
              >><br>
              >> diff --git
              a/drivers/gpu/drm/scheduler/gpu_scheduler.c <br>
              >> b/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
              >> index f563e4fbb4b6..1b733229201e 100644<br>
              >> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
              >> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
              >> @@ -198,21 +198,6 @@ int
              drm_sched_entity_init(struct <br>
              >> drm_sched_entity *entity,<br>
              >>   }<br>
              >>   EXPORT_SYMBOL(drm_sched_entity_init);<br>
              >>   -/**<br>
              >> - * drm_sched_entity_is_initialized - Query if
              entity is initialized<br>
              >> - *<br>
              >> - * @sched: Pointer to scheduler instance<br>
              >> - * @entity: The pointer to a valid scheduler
              entity<br>
              >> - *<br>
              >> - * return true if entity is initialized, false
              otherwise<br>
              >> -*/<br>
              >> -static bool
              drm_sched_entity_is_initialized(struct drm_gpu_scheduler <br>
              >> *sched,<br>
              >> -                        struct drm_sched_entity
              *entity)<br>
              >> -{<br>
              >> -    return entity->rq != NULL &&<br>
              >> -        entity->rq->sched == sched;<br>
              >> -}<br>
              >> -<br>
              >>   /**<br>
              >>    * drm_sched_entity_is_idle - Check if entity
              is idle<br>
              >>    *<br>
              >> @@ -224,7 +209,8 @@ static bool
              drm_sched_entity_is_idle(struct <br>
              >> drm_sched_entity *entity)<br>
              >>   {<br>
              >>       rmb();<br>
              >>   -    if (!entity->rq ||
              spsc_queue_peek(&entity->job_queue) == NULL)<br>
              >> +    if (list_empty(&entity->list) ||<br>
              >> +       
              spsc_queue_peek(&entity->job_queue) == NULL)<br>
              >>           return true;<br>
              >>         return false;<br>
              >> @@ -279,8 +265,6 @@ long
              drm_sched_entity_flush(struct <br>
              >> drm_sched_entity *entity, long timeout)<br>
              >>       long ret = timeout;<br>
              >>         sched = entity->rq->sched;<br>
              >> -    if (!drm_sched_entity_is_initialized(sched,
              entity))<br>
              >> -        return ret;<br>
              >>       /**<br>
              >>        * The client will not queue more IBs
              during this fini, consume <br>
              >> existing<br>
              >>        * queued IBs or discard them on SIGKILL<br>
              >> @@ -299,7 +283,7 @@ long
              drm_sched_entity_flush(struct <br>
              >> drm_sched_entity *entity, long timeout)<br>
              >>       last_user =
              cmpxchg(&entity->last_user,
              current->group_leader, <br>
              >> NULL);<br>
              >>       if ((!last_user || last_user ==
              current->group_leader) &&<br>
              >>           (current->flags & PF_EXITING)
              && (current->exit_code == <br>
              >> SIGKILL))<br>
              >> -        drm_sched_entity_set_rq(entity, NULL);<br>
              >> +       
              drm_sched_rq_remove_entity(entity->rq, entity);<br>
              >>         return ret;<br>
              >>   }<br>
              >> @@ -320,7 +304,7 @@ void
              drm_sched_entity_fini(struct <br>
              >> drm_sched_entity *entity)<br>
              >>       struct drm_gpu_scheduler *sched;<br>
              >>         sched = entity->rq->sched;<br>
              >> -    drm_sched_entity_set_rq(entity, NULL);<br>
              >> +    drm_sched_rq_remove_entity(entity->rq,
              entity);<br>
              >>         /* Consumption of existing IBs wasn't
              completed. Forcefully<br>
              >>        * remove them here.<br>
              >> @@ -416,15 +400,12 @@ void
              drm_sched_entity_set_rq(struct <br>
              >> drm_sched_entity *entity,<br>
              >>       if (entity->rq == rq)<br>
              >>           return;<br>
              >>   -    spin_lock(&entity->rq_lock);<br>
              >> -<br>
              >> -    if (entity->rq)<br>
              >> -       
              drm_sched_rq_remove_entity(entity->rq, entity);<br>
              >> +    BUG_ON(!rq);<br>
              >>   +    spin_lock(&entity->rq_lock);<br>
              >> +    drm_sched_rq_remove_entity(entity->rq,
              entity);<br>
              >>       entity->rq = rq;<br>
              >> -    if (rq)<br>
              >> -        drm_sched_rq_add_entity(rq, entity);<br>
              >> -<br>
              >> +    drm_sched_rq_add_entity(rq, entity);<br>
              >>       spin_unlock(&entity->rq_lock);<br>
              >>   }<br>
              >>   EXPORT_SYMBOL(drm_sched_entity_set_rq);<br>
              ><br>
              <br>
            </blockquote>
          </div>
        </div>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>