<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Harish,<br>
<br>
for the next time please use git send-email to send patches to the
mailing list.<br>
<br>
Looks pretty good in general, but a few notes all over the place.<br>
<br>
Patch #1:<br>
<blockquote type="cite">+static bool amdgpu_vm_is_large_bar(struct
amdgpu_device *adev)<br>
+{<br>
+ if (adev->mc.visible_vram_size > 0 &&<br>
+ adev->mc.real_vram_size ==
adev->mc.visible_vram_size)<br>
+ return true;<br>
+ return false;<br>
+}<br>
</blockquote>
The visible_vram_size > 0 check looks superfluous. The coding
style looks off, the second line of the "if" is to far right.<br>
<br>
And in general that should rather look like "return
adev->mc.real_vram_size == adev->mc.visible_vram_size;".<br>
<br>
<blockquote type="cite">+ /* CPU update is only supported for
large BAR system */<br>
+ vm->is_vm_update_mode_cpu = (vm_update_mode &&<br>
+ amdgpu_vm_is_large_bar(adev));</blockquote>
Mhm, it would be nice if we could enable this for testing even on
systems with small BAR.<br>
<br>
I would rather suggest to make the default value depend on if the
BOX has a large or small BAR and let the user override the
setting.<br>
<br>
Additional to that we should limit this to 64bit systems.<br>
<br>
<blockquote type="cite">+#define
AMDGPU_VM_USE_CPU_UPDATE_FOR_GRAPHICS_MASK (1 << 0)<br>
+#define AMDGPU_VM_USE_CPU_UPDATE_FOR_COMPUTE_MASK (1 <<
1)</blockquote>
Not much of an issue, but the names are a bit long. Maybe use
something like AMDGPU_VM_USE_CPU_FOR_GFX and
AMDGPU_VM_USE_CPU_FOR_COMPUTE.<br>
<br>
<blockquote type="cite">+ bool
is_vm_update_mode_cpu : 1;</blockquote>
Please no bitmasks when we don't need them.<br>
<br>
Patch #2:<br>
<blockquote type="cite">+ u64 flags;<br>
unsigned shift = (adev->vm_manager.num_level - level) *<br>
adev->vm_manager.block_size;<br>
</blockquote>
Reverse tree order.<br>
<br>
<blockquote type="cite">+ flags =
(vm->is_vm_update_mode_cpu) ?<br>
+ AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED :<br>
+ AMDGPU_GEM_CREATE_NO_CPU_ACCESS |<br>
+ AMDGPU_GEM_CREATE_SHADOW;<br>
+<br>
+<br>
</blockquote>
...<br>
<blockquote type="cite">-
AMDGPU_GEM_CREATE_NO_CPU_ACCESS |<br>
- AMDGPU_GEM_CREATE_SHADOW |<br>
+ flags |<br>
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |<br>
AMDGPU_GEM_CREATE_VRAM_CLEARED,</blockquote>
<br>
I would rather write it like this:<br>
<br>
flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
AMDGPU_GEM_CREATE_VRAM_CLEARED;<br>
if (vm->is_vm_update_mode_cpu)<br>
flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;<br>
else<br>
flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
AMDGPU_GEM_CREATE_SHADOW;<br>
<br>
<blockquote type="cite">+ mb();<br>
+ amdgpu_gart_flush_gpu_tlb(params->adev, 0);<br>
</blockquote>
You do this for the HDP flush, don't you?<br>
<br>
<blockquote type="cite">+ dev_warn(adev->dev, "Page
table update using CPU failed. Fallback to SDMA\n");<br>
</blockquote>
NACK, if kmapping the BO fails we most likely are either out of
memory or out of address space.<br>
<br>
Falling back to the SDMA doesn't make the situation better, just
return a proper error code here.<br>
<br>
<blockquote type="cite">+ /* Wait for BO to be free.
SDMA could be clearing it */<br>
+ amdgpu_sync_create(&sync);<br>
+ amdgpu_sync_resv(adev, &sync,
parent->bo->tbo.resv,<br>
+ AMDGPU_FENCE_OWNER_VM);<br>
+ amdgpu_sync_wait(&sync);<br>
+ amdgpu_sync_free(&sync);</blockquote>
Superfluous and a bit incorrect, amdgpu_bo_kmap already calls
reservation_object_wait_timeout_rcu() but only for the exclusive
fence.<br>
<br>
You probably ran into problems because of pipelined evictions, so
that should be fixed in amdgpu_bo_kmap() instead.<br>
<br>
<blockquote type="cite">-
amdgpu_vm_do_set_ptes(¶ms,<br>
- last_shadow,<br>
- pt_addr, count,<br>
- incr,<br>
- AMDGPU_PTE_VALID);<br>
-<br>
- amdgpu_vm_do_set_ptes(¶ms, last_pde,<br>
- pt_addr, count, incr,<br>
- AMDGPU_PTE_VALID);<br>
+ params.func(¶ms,<br>
+ last_shadow,<br>
+ pt_addr, count,<br>
+ incr,<br>
+ AMDGPU_PTE_VALID);<br>
+<br>
+ params.func(¶ms, last_pde,<br>
+ pt_addr, count, incr,<br>
+ AMDGPU_PTE_VALID);</blockquote>
Might be a good idea to separate that into a cleanup patch.<br>
<br>
Patch #3: Looks good to me and is Reviewed-by: Christian König
<a class="moz-txt-link-rfc2396E" href="mailto:christian.koenig@amd.com"><christian.koenig@amd.com></a>.<br>
<br>
Please move that one to the beginning of the series and/or commit
it directly to amd-staging-4.9.<br>
<br>
Patch #4:<br>
<blockquote type="cite">+ /* The next three are used during VM
update by CPU */<br>
+ bool update_by_cpu;</blockquote>
We can distinguish that as well by looking at the function
pointer, don't we?<br>
<br>
<blockquote type="cite">+ dev_warn(adev->dev,<br>
+ "CPU update of VM failed. Fallback to SDMA\n");<br>
+<br>
+ /* Reset params for SDMA fallback path */<br>
+ params.update_by_cpu = false;<br>
+ params.pages_addr = NULL;<br>
+ params.kptr = NULL;<br>
+ params.func = NULL;</blockquote>
Again completely drop the SDMA fallback path.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 09.05.2017 um 20:34 schrieb Kasiviswanathan, Harish:<br>
</div>
<blockquote
cite="mid:DM3PR1201MB103847C11F89904752F962768CEF0@DM3PR1201MB1038.namprd12.prod.outlook.com"
type="cite">
<pre wrap="">Hi,
Please review the patch set that supports amdgpu VM update via CPU. This feature provides improved performance for compute (HSA) where mapping / unmapping is carried out (by Kernel) independent of command submissions (done directly by user space). This version doesn't support shadow copy of VM page tables for CPU based update.
Best Regards,
Harish
</pre>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
</blockquote>
<p><br>
</p>
</body>
</html>