<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">I would rather avoid taking the
lock in the hot path.<br>
<br>
How about this:<br>
<br>
/* For a killed process, stop enqueueing any more IBs right now
*/<br>
last_user = cmpxchg(&entity->last_user,
current->group_leader, NULL);<br>
if ((!last_user || last_user == current->group_leader)
&&<br>
(current->flags & PF_EXITING) &&
(current->exit_code == SIGKILL)) {<br>
grab_lock();<br>
drm_sched_rq_remove_entity(entity->rq, entity);<br>
/* Another process pushed a job in the meantime: put the entity back */<br>
if (READ_ONCE(entity->last_user) != NULL)<br>
drm_sched_rq_add_entity(entity->rq, entity);<br>
drop_lock();<br>
}<br>
<br>
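To illustrate why the re-check under the lock matters, here is a minimal single-threaded userspace model that replays the problematic interleaving step by step. It is only a sketch: recheck_entity, recheck_cmpxchg and recheck_replay are hypothetical names, process ids stand in for the task pointers, the killed-process checks are assumed to be true, and the model assumes the other process finishes its push before the lock is taken here.<br>
<br>
<pre>
```c
/* Userspace model only; every name here is hypothetical, not scheduler API. */
struct recheck_entity {
    int last_user;  /* 0 = nobody, otherwise a process id */
    int on_rq;      /* 1 while the entity sits on the run queue */
};

/* Stand-in for cmpxchg(): store new_val if *p == expected, return old value. */
static int recheck_cmpxchg(int *p, int expected, int new_val)
{
    int old = *p;

    if (old == expected)
        *p = new_val;
    return old;
}

/*
 * Replays the flush path of dying process A. If b_pushes is non-zero,
 * process B pushes a job between A's cmpxchg and A taking the lock.
 * Returns 1 if the entity ends up on the run queue, 0 otherwise.
 */
static int recheck_replay(int b_pushes)
{
    struct recheck_entity e = { .last_user = 1, .on_rq = 1 };
    int a = 1, b = 2;
    int last;

    last = recheck_cmpxchg(&e.last_user, a, 0);  /* A: lock-free step */

    if (b_pushes) {                /* B: push path runs in between */
        e.last_user = b;
        e.on_rq = 1;
    }

    if (last == 0 || last == a) {  /* A is assumed to be killed */
        /* grab_lock() would go here */
        e.on_rq = 0;               /* remove the entity */
        if (e.last_user != 0)      /* somebody pushed meanwhile */
            e.on_rq = 1;           /* add the entity back */
        /* drop_lock() would go here */
    }
    return e.on_rq;
}
```
</pre>
In this model recheck_replay(1) leaves the entity on the run queue, while recheck_replay(0) removes it as intended for a killed process with no other user.<br>
<br>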
Christian.<br>
<br>
On 13.08.2018 at 18:43, Andrey Grodzovsky wrote:<br>
</div>
<blockquote type="cite"
cite="mid:82109a00-aebf-1e5f-5346-eef541a361df@amd.com">
<p>Attached. </p>
<p>If the general idea in the patch is OK, I can think of a test
(and maybe add it to the libdrm amdgpu tests) that actually simulates this
scenario: two forked concurrent processes working on the same entity's
job queue, where one is dying while the other keeps pushing to the same
queue. So far I have only tested it with a normal boot and by running
multiple glxgears instances concurrently, which doesn't really exercise
this code path since I think each of them works on its own FD.<br>
</p>
<p>Andrey<br>
</p>
<br>
<div class="moz-cite-prefix">On 08/10/2018 09:27 AM, Christian
König wrote:<br>
</div>
<blockquote type="cite"
cite="mid:5bf40a54-18f9-98fd-a3df-dd0b8da0a424@gmail.com">
<div class="moz-cite-prefix">Crap, yeah indeed that needs to be
protected by some lock.<br>
<br>
Going to prepare a patch for that,<br>
Christian.<br>
<br>
On 09.08.2018 at 21:49, Andrey Grodzovsky wrote:<br>
</div>
<blockquote type="cite"
cite="mid:54621fc1-7246-f1bf-26bb-a16c4daf249f@amd.com">
<p>Reviewed-by: Andrey Grodzovsky <a
class="moz-txt-link-rfc2396E"
href="mailto:andrey.grodzovsky@amd.com"
moz-do-not-send="true"><andrey.grodzovsky@amd.com></a></p>
<p><br>
</p>
<p>But I still have questions about entity->last_user
(I didn't notice this before).<br>
</p>
<p>It looks to me like there is a race condition in its current
usage. Say process A is preempted right after doing the cmpxchg(...)
in drm_sched_entity_flush. Process B, working on the same entity
(forked), is now inside drm_sched_entity_push_job: it writes its PID
to entity->last_user and also executes drm_sched_rq_add_entity.
Process A then runs again and executes drm_sched_rq_remove_entity,
inadvertently removing the entity that process B just added from its
scheduler rq.</p>
<p>It looks to me like we should instead take one lock around both the
entity->last_user accesses and the additions/removals of the entity
to and from the rq.</p>
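<p>The interleaving described above can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: model_entity, model_cmpxchg, model_push_job and the two replay functions are hypothetical names, process ids stand in for the task pointers, process A is assumed to be the one being killed, and the lock is represented only by the order of the steps.</p>
<pre>
```c
/* Userspace model only; every name here is hypothetical, not scheduler API. */
struct model_entity {
    int last_user;  /* 0 = nobody, otherwise a process id */
    int on_rq;      /* 1 while the entity sits on the run queue */
};

/* Stand-in for cmpxchg(): store new_val if *p == expected, return old value. */
static int model_cmpxchg(int *p, int expected, int new_val)
{
    int old = *p;

    if (old == expected)
        *p = new_val;
    return old;
}

/* Process B's push path: record the pusher and add the entity to the rq. */
static void model_push_job(struct model_entity *e, int pid)
{
    e->last_user = pid;
    e->on_rq = 1;
}

/* A does its cmpxchg, is preempted, B pushes, then A resumes and removes. */
static int replay_unlocked_race(void)
{
    struct model_entity e = { .last_user = 1, .on_rq = 1 };
    int a = 1, b = 2;
    int last;

    last = model_cmpxchg(&e.last_user, a, 0);  /* A: flush, first step */
    model_push_job(&e, b);                     /* B runs in between */
    if (last == 0 || last == a)
        e.on_rq = 0;                           /* A: flush, second step */

    return e.on_rq;  /* B's job is queued but the entity is off the rq */
}

/* Same steps, but A's two steps form one critical section. */
static int replay_locked(void)
{
    struct model_entity e = { .last_user = 1, .on_rq = 1 };
    int a = 1, b = 2;
    int last;

    /* A holds the lock from here ... */
    last = model_cmpxchg(&e.last_user, a, 0);
    if (last == 0 || last == a)
        e.on_rq = 0;
    /* ... to here; B can only push after A has dropped it */
    model_push_job(&e, b);

    return e.on_rq;
}
```
</pre>
<p>In this model replay_unlocked_race() ends with the entity off the run queue even though B has pushed a job, while replay_locked() ends with the entity on the run queue.</p>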
<p>Andrey<br>
</p>
<br>
<div class="moz-cite-prefix">On 08/06/2018 10:18 AM, Nayan
Deshmukh wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAFd4ddzyvHPHepAgs=mjyWVj0WDV_pQbE9x7aHwNZ_zcME6fqQ@mail.gmail.com">
<div dir="ltr">
<div>
<div>I forgot about this since we started discussing
possible scenarios of processes and threads.<br>
<br>
</div>
In any case, this check is redundant. Acked-by: Nayan
Deshmukh <<a href="mailto:nayan26deshmukh@gmail.com"
moz-do-not-send="true">nayan26deshmukh@gmail.com</a>><br>
<br>
</div>
Nayan<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Mon, Aug 6, 2018 at 7:43 PM Christian
König <<a
href="mailto:ckoenig.leichtzumerken@gmail.com"
moz-do-not-send="true">ckoenig.leichtzumerken@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Ping.
Any objections to that?<br>
<br>
Christian.<br>
<br>
On 03.08.2018 at 13:08, Christian König wrote:<br>
> That is superfluous now.<br>
><br>
> Signed-off-by: Christian König <<a
href="mailto:christian.koenig@amd.com" target="_blank"
moz-do-not-send="true">christian.koenig@amd.com</a>><br>
> ---<br>
> drivers/gpu/drm/scheduler/gpu_scheduler.c | 5
-----<br>
> 1 file changed, 5 deletions(-)<br>
><br>
> diff --git
a/drivers/gpu/drm/scheduler/gpu_scheduler.c
b/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
> index 85908c7f913e..65078dd3c82c 100644<br>
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c<br>
> @@ -590,11 +590,6 @@ void
drm_sched_entity_push_job(struct drm_sched_job
*sched_job,<br>
> if (first) {<br>
> /* Add the entity to the run queue */<br>
> spin_lock(&entity->rq_lock);<br>
> - if (!entity->rq) {<br>
> - DRM_ERROR("Trying to push to
a killed entity\n");<br>
> -
spin_unlock(&entity->rq_lock);<br>
> - return;<br>
> - }<br>
>
drm_sched_rq_add_entity(entity->rq, entity);<br>
> spin_unlock(&entity->rq_lock);<br>
>
drm_sched_wakeup(entity->rq->sched);<br>
<br>
</blockquote>
</div>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
<br>
<br>
<pre wrap="">_______________________________________________
dri-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org">dri-devel@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/dri-devel">https://lists.freedesktop.org/mailman/listinfo/dri-devel</a>
</pre>
</blockquote>
<br>
</body>
</html>