<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">
<blockquote type="cite">And then try again (until ?).</blockquote>
Until the LRU is empty.<br>
<br>
You see, there is one LRU per domain, so when a buffer is evicted
from VRAM it is moved to the GTT domain and also removed from the
VRAM LRU.<br>
<br>
When no other task is trying to do a command submission (CS), the
LRU will sooner or later become empty.<br>
<br>
One possible explanation for what happens here is that another
process/thread is moving buffers back in while the first process is
trying to evict them.<br>
<br>
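The interleaving described above can be illustrated with a toy model.
This is only a sketch in Python under stated assumptions, not TTM code:
"run_interleaved" and its parameters are hypothetical names, the two
queues stand in for the VRAM and GTT LRU lists, and the strict
one-for-one alternation is a simplification of real scheduling.<br>
<pre wrap="">
```python
from collections import deque


def run_interleaved(vram_lru, steps, competitor_active):
    """Alternate one eviction with one optional move-back.

    Returns the number of buffers left on the (toy) VRAM LRU.
    """
    vram = deque(vram_lru)   # oldest buffer first
    gtt = deque()
    for _ in range(steps):
        if not vram:
            break                           # LRU empty: the evictor would stop here
        gtt.append(vram.popleft())          # first process evicts the oldest buffer
        if competitor_active and gtt:
            vram.append(gtt.popleft())      # second process moves a buffer back in
    return len(vram)


alone = run_interleaved(["bo0", "bo1", "bo2"], steps=100, competitor_active=False)
contended = run_interleaved(["bo0", "bo1", "bo2"], steps=100, competitor_active=True)
```
</pre>
In this model the LRU drains when the evictor runs alone, while with
the competitor active it never shrinks, however many steps are
allowed.<br>
<br>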
Regards,<br>
Christian.<br>
<br>
On 14.03.2017 at 17:31, Julien Isorce wrote:<br>
</div>
<blockquote
cite="mid:CAHWPjbVcb8vYwNmR8bAEbQu_C9dEmPNWS9hCLmqaJPH7F8X75w@mail.gmail.com"
type="cite">
<div dir="ltr">Hello,
<div><br>
</div>
<div>While debugging a softlock that happens in an
ioctl(RADEON_CS), I found that it spins indefinitely in the
following loop: </div>
<div><a moz-do-not-send="true"
href="https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/ttm/ttm_bo.c#L819">https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/ttm/ttm_bo.c#L819</a></div>
<div><br>
</div>
<div>It would be great if someone could explain the logic
behind this loop. My understanding is that it tries to get a
free node for the current buffer object by calling
"ttm_bo_man_get_node". If that leaves mem->mm_node as
NULL (internally -ENOSPC), it tries to evict another
buffer from the LRU by calling "ttm_mem_evict_first". And then
try again (until ?).</div>
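<div>To make the loop described above concrete, here is a minimal
sketch in Python. It is a toy model and not the kernel code:
"force_space" is a hypothetical name, a plain free-space counter
stands in for the node allocation (fragmentation is ignored), and
the list of (name, size) pairs stands in for the LRU. The assumed
exit conditions are that the allocation succeeds or that nothing is
left to evict.</div>
<pre wrap="">
```python
def force_space(free, needed, lru):
    """Evict from the oldest end of `lru` until `needed` units fit.

    `lru` is a list of (name, size) pairs, oldest first.
    Returns (ok, free_after, evicted_names).
    """
    lru = list(lru)          # work on a copy, leave the caller's list alone
    evicted = []
    while True:
        if free >= needed:   # stands in for a successful node allocation
            return True, free, evicted
        if not lru:          # nothing left to evict: give up
            return False, free, evicted
        name, size = lru.pop(0)   # evict the least recently used buffer
        free += size
        evicted.append(name)


fits = force_space(0, 3, [("a", 2), ("b", 2), ("c", 2)])
too_big = force_space(0, 10, [("a", 2)])
```
</pre>
<div>In this model the loop can only spin forever if evictions keep
succeeding without increasing the free space, which matches the
symptom of both calls returning 0 with no node allocated.</div>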
<div><br>
</div>
<div>For some reason, after a while when running an app
that uploads a lot of images through GL, these 2 functions keep
returning 0 with mem->mm_node as NULL, so the "while (true)"
loops indefinitely. As a result, the process stays stuck in
that ioctl forever.</div>
<div><br>
</div>
<div>A nasty workaround is to break out of the loop after a
threshold number of iterations. It looks like it very rarely
goes over 200, so breaking after more than 200 iterations and
returning -ENOMEM lets the application regain control instead of
being stuck. This is quite helpful for the debugging phase but
definitely not a proper fix.</div>
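<div>The iteration cap can be sketched as follows. This is a Python
toy model of the workaround, not a kernel patch: "force_space_capped",
"try_alloc" and "evict_first" are hypothetical names, the two
callbacks stand in for the allocation and eviction calls, and the cap
of 200 is simply the threshold mentioned above.</div>
<pre wrap="">
```python
import errno

MAX_ITERATIONS = 200  # threshold from the message above; debugging aid only


def force_space_capped(try_alloc, evict_first, max_iterations=MAX_ITERATIONS):
    """Return 0 on success or a negative errno, with a bounded number of tries."""
    for _ in range(max_iterations):
        if try_alloc():          # a node was obtained
            return 0
        ret = evict_first()      # 0 means "evicted something, try again"
        if ret != 0:
            return ret           # propagate a real eviction error
    return -errno.ENOMEM         # cap reached: hand control back to the caller


calls = {"alloc": 0}


def never_alloc():
    # Models the stuck case: allocation never succeeds.
    calls["alloc"] += 1
    return False


result = force_space_capped(never_alloc, lambda: 0)
```
</pre>
<div>With an allocation that never succeeds and an eviction that
always reports success, the model returns -ENOMEM after the cap
instead of spinning. It hides the underlying problem rather than
fixing it.</div>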
<div><br>
</div>
<div>A colleague found that replacing ttm_bo_unreserve with
__ttm_bo_unreserve here <a moz-do-not-send="true"
href="https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/ttm/ttm_bo.c#L751">https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/ttm/ttm_bo.c#L751</a>
fixes this softlock, because the latter does not re-add the
evicted buffer to the LRU.</div>
<div>But we are unsure whether this is a proper fix or just a
workaround, given that this line has existed since the first TTM
commit in 2009. Any comments?</div>
<div><br>
</div>
<div>Also, it looks like there is a recursion in the following call chain (the @ marks the repeated call):</div>
<div><br>
</div>
<div>
<div>radeon_cs_ioctl</div>
<div>radeon_cs_parser_relocs</div>
<div>radeon_bo_list_validate</div>
<div>ttm_bo_validate</div>
<div>ttm_bo_move_buffer</div>
<div>ttm_bo_mem_space @</div>
<div>ttm_bo_mem_force_space</div>
<div>ttm_mem_evict_first</div>
<div>ttm_bo_evict</div>
<div>ttm_bo_mem_space @</div>
</div>
<div>ttm_mem_evict_first</div>
<div>...</div>
<div><br>
</div>
<div>It looks like it is meant to work like this, but this makes
it complicated to follow. So any input would be much appreciated,
especially about the eviction mechanism, the bo->evicted flag,
and how TTM manages the LRU in corner cases such as when
VRAM is full.</div>
<div><br>
</div>
<div>I tried kernel 4.4, 4.8 and git HEAD from last week.</div>
<div><br>
</div>
<div>Thx</div>
<div>Julien</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
dri-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org">dri-devel@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/dri-devel">https://lists.freedesktop.org/mailman/listinfo/dri-devel</a>
</pre>
</blockquote>
</body>
</html>