<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 12.12.2016 at 16:20, Nicolai
Hähnle wrote:<br>
</div>
<blockquote
cite="mid:9153dde6-9087-94b0-a062-dd223e5f381a@gmail.com"
type="cite">Hi all,
<br>
<br>
I just sent out two patches that hopefully make the kernel module
more robust in the face of page table shadows being swapped out.
<br>
<br>
However, even with those patches, I can still fairly reliably
reproduce crashes with a backtrace of the shape
<br>
<br>
amdgpu_cs_ioctl
<br>
-> amdgpu_vm_update_page_directory
<br>
-> amdgpu_ttm_bind
<br>
-> amdgpu_gtt_mgr_alloc
<br>
<br>
The plausible reason for these crashes is that nothing seems to
prevent the shadow BOs from being moved between the calls to
amdgpu_cs_validate in amdgpu_cs_parser_bos and the calls to
amdgpu_ttm_bind.
<br>
</blockquote>
<br>
The shadow BOs use the same reservation object as the real BOs. So
as long as the real BOs can't be evicted, the shadows can't be
evicted either.<br>
<br>
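As a toy illustration of that locking argument (a Python sketch, not
the amdgpu/TTM code; ReservationObject, BufferObject and try_evict
are made-up names, and eviction is assumed to simply give up when it
can't take the reservation):<br>
<br>
<pre wrap="">
```python
# Toy model only: these classes are hypothetical stand-ins,
# not the kernel's data structures.

class ReservationObject:
    """A lock that records whether it is currently held."""

    def __init__(self):
        self.held = False

    def try_lock(self):
        if self.held:
            return False
        self.held = True
        return True

    def unlock(self):
        self.held = False


class BufferObject:
    def __init__(self, name, resv):
        self.name = name
        self.resv = resv  # may be shared with other BOs
        self.evicted = False


def try_evict(bo):
    """Eviction needs the BO's reservation; fail if it is already held."""
    if not bo.resv.try_lock():
        return False
    bo.evicted = True
    bo.resv.unlock()
    return True


resv = ReservationObject()
page_table = BufferObject("pt", resv)
shadow = BufferObject("pt-shadow", resv)  # same reservation object

# Command submission reserves the real BO ...
cs_holds_lock = page_table.resv.try_lock()
# ... so evicting the shadow fails while that reservation is held.
evicted_during_cs = try_evict(shadow)

# Once the reservation is dropped, the shadow can be evicted again.
page_table.resv.unlock()
evicted_after_cs = try_evict(shadow)
```
</pre>
In this model the shadow can only be evicted once the reservation on
the real BO has been dropped.<br>
<br>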
<blockquote
cite="mid:9153dde6-9087-94b0-a062-dd223e5f381a@gmail.com"
type="cite">
<br>
The attached patch has fixed these crashes for me so far, but it's
very heavy-handed: it collects all page table shadows and the page
directory shadow and adds them all to the reservations for the
callers of amdgpu_vm_update_page_directory.
<br>
</blockquote>
<br>
That is most likely just a timing change, because the shadows should
end up in the duplicates list anyway. So the patch shouldn't have
any effect.<br>
<br>
<blockquote
cite="mid:9153dde6-9087-94b0-a062-dd223e5f381a@gmail.com"
type="cite">
<br>
I feel like there should be a better way. In part, I wonder why
the shadows are needed in the first place. I vaguely recall the
discussions about GPU reset and such, but I don't remember why the
page tables can't just be rebuilt in some other way.
<br>
</blockquote>
<br>
It's just the simplest and fastest way to keep a copy of the page
tables around.<br>
<br>
The problem with rebuilding the page tables from the mappings is
that the housekeeping structures already have the future state when
a reset happens, not the state we need to rebuild the tables.<br>
<br>
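To illustrate with a toy model (a Python sketch, not kernel code; all
names are made up, updates are assumed to be written to the page
table in a separate flush step, and the reset is simulated by simply
clearing the table):<br>
<br>
<pre wrap="">
```python
# Toy model only; all names are hypothetical.
hw_page_table = {}      # what the GPU actually uses
shadow_page_table = {}  # copy written alongside every page table update
mappings = {}           # housekeeping, changed as soon as a map/unmap is requested


def request_map(va, pa):
    """Housekeeping changes immediately; the page table write is deferred."""
    mappings[va] = pa


def request_unmap(va):
    mappings.pop(va, None)


def flush_updates():
    """Write the pending state to the page table and mirror it to the shadow."""
    hw_page_table.clear()
    hw_page_table.update(mappings)
    shadow_page_table.clear()
    shadow_page_table.update(hw_page_table)


request_map(0x1000, 0xA000)
flush_updates()              # the hardware now sees 0x1000 -> 0xA000
request_unmap(0x1000)        # requested, not yet written to the page table
request_map(0x2000, 0xB000)  # requested as well

state_before_reset = dict(hw_page_table)
hw_page_table.clear()        # simulated reset: table contents are lost

# Rebuilding from the housekeeping yields the future state ...
rebuilt_from_mappings = dict(mappings)
# ... while the shadow still holds what the hardware was using.
restored_from_shadow = dict(shadow_page_table)
```
</pre>
Here the shadow restores exactly the pre-reset table, while the
housekeeping already reflects the map and unmap requests that were
never written.<br>
<br>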
We could obviously change the housekeeping a bit to keep both
states, but that would complicate mapping and unmapping of BOs
significantly.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote
cite="mid:9153dde6-9087-94b0-a062-dd223e5f381a@gmail.com"
type="cite">
<br>
Cheers,
<br>
Nicolai
<br>
<br>
</blockquote>
</body>
</html>