<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div>
<div class="moz-cite-prefix">On 2023-10-03 17:37, Felix Kuehling wrote:<br>
</div>
<blockquote type="cite">On 2023-10-03 16:50, Philip Yang wrote: <br>
<blockquote type="cite" style="color: rgb(0, 124, 255);">If there is no VRAM domain, bo_node is NULL and this causes crash.
<br>
Refactor the change, and use the module parameter as higher privilege. <br>
<br>
Need another patch to support override PTE flag on APU. <br>
<br>
Fixes: 55d7e2001c7e ("drm/amdgpu: Add EXT_COHERENT memory allocation flags") <br>
Signed-off-by: Philip Yang <a href="mailto:Philip.Yang@amd.com" class="moz-txt-link-rfc2396E OWAAutoLink" data-loopstyle="linkonly" id="OWA5551543b-0498-5b81-127d-578773d2e4b2">
<Philip.Yang@amd.com></a> <br>
--- <br>
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 +++++++----------- <br>
1 file changed, 7 insertions(+), 11 deletions(-) <br>
<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
<br>
index 0d88698ae33f..305b2c54edfa 100644 <br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c <br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c <br>
@@ -1248,26 +1248,22 @@ svm_range_get_pte_flags(struct kfd_node *node, <br>
break; <br>
case IP_VERSION(9, 4, 3): <br>
mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC : <br>
- (amdgpu_mtype_local == 2 ? AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW);
<br>
+ (amdgpu_mtype_local == 2 || ext_coherent ? <br>
+ AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW); <br>
</blockquote>
<br>
We had some offline discussion where I thought that MTYPE_NC should <br>
become MTYPE_UC when ext_coherent is enabled to get the desired memory <br>
semantics. With that idea in mind, this would become a bit more messy, <br>
but here it goes, as clean as I can make it: <br>
<br>
- mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC : <br>
- (amdgpu_mtype_local == 2 ? AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW);
<br>
+ mtype_local = amdgpu_mtype_local == 1 && !ext_coherent ? AMDGPU_VM_MTYPE_NC :
<br>
+ (amdgpu_mtype_local == 1 && ext_coherent ? AMDGPU_VM_MTYPE_UC :
<br>
+ (amdgpu_mtype_local == 2 || ext_coherent ? AMDGPU_VM_MTYPE_CC :
<br>
+ AMDGPU_VM_MTYPE_RW));
<br>
<br>
</blockquote>
<p>That ternary looks fairly gnarly. I think it would be worth the extra ink to write
<br>
</p>
<p> mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC :<br>
(amdgpu_mtype_local == 2 ? AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW);<br>
<br>
if (ext_coherent) {<br>
    if (amdgpu_mtype_local == 1)<br>
mtype_local = AMDGPU_VM_MTYPE_UC;<br>
else<br>
mtype_local = AMDGPU_VM_MTYPE_CC;<br>
}<br>
</p>
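<p>As a sanity check that the two-step form matches the nested ternary quoted above, here is a minimal standalone sketch. The MTYPE_* enum and both function names are hypothetical stand-ins (not the kernel's AMDGPU_VM_MTYPE_* definitions or any existing helper), and the module parameter is passed in as a plain int:</p>
<pre>
```c
/* Hypothetical stand-ins for AMDGPU_VM_MTYPE_*; the values are placeholders. */
enum mtype { MTYPE_RW, MTYPE_NC, MTYPE_CC, MTYPE_UC };

/* Nested-ternary form, following the quoted suggestion. */
enum mtype mtype_local_ternary(int mtype_param, int ext_coherent)
{
	return mtype_param == 1 && !ext_coherent ? MTYPE_NC :
		(mtype_param == 1 && ext_coherent ? MTYPE_UC :
		(mtype_param == 2 || ext_coherent ? MTYPE_CC : MTYPE_RW));
}

/* Two-step form: pick the default from the parameter, then override
 * it when ext_coherent is set. */
enum mtype mtype_local_two_step(int mtype_param, int ext_coherent)
{
	enum mtype m = mtype_param == 1 ? MTYPE_NC :
		(mtype_param == 2 ? MTYPE_CC : MTYPE_RW);

	if (ext_coherent)
		m = mtype_param == 1 ? MTYPE_UC : MTYPE_CC;

	return m;
}
```
</pre>
<p>Both functions return the same value for mtype_param in {0, 1, 2} with ext_coherent either 0 or 1, so the choice between them is purely about readability.</p>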
<blockquote type="cite">But maybe that could be fixed up in a follow up patch. Either way, for
<br>
the purpose of fixing the crash, this patch is <br>
<br>
Reviewed-by: Felix Kuehling <a href="mailto:Felix.Kuehling@amd.com" class="moz-txt-link-rfc2396E OWAAutoLink" data-loopstyle="linkonly" id="OWA0fb4eb3e-e5a8-e7ad-ed11-bf5f07139e99">
<Felix.Kuehling@amd.com></a> <br>
<br>
<br>
<blockquote type="cite" style="color: rgb(0, 124, 255);"> snoop = true;
<br>
if (uncached) { <br>
mapping_flags |= AMDGPU_VM_MTYPE_UC; <br>
- } else if (ext_coherent) { <br>
- /* local HBM region close to partition */ <br>
- if (bo_node->adev == node->adev && <br>
- (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
<br>
- mapping_flags |= AMDGPU_VM_MTYPE_CC; <br>
- else <br>
- mapping_flags |= AMDGPU_VM_MTYPE_UC; <br>
} else if (domain == SVM_RANGE_VRAM_DOMAIN) { <br>
/* local HBM region close to partition */ <br>
if (bo_node->adev == node->adev && <br>
(!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
<br>
mapping_flags |= mtype_local; <br>
- /* local HBM region far from partition or remote XGMI GPU */
<br>
- else if (svm_nodes_in_same_hive(bo_node, node)) <br>
+ /* local HBM region far from partition or remote XGMI GPU <br>
+ * with regular system scope coherence <br>
+ */ <br>
+ else if (svm_nodes_in_same_hive(bo_node, node) && !ext_coherent)
<br>
mapping_flags |= AMDGPU_VM_MTYPE_NC; <br>
- /* PCIe P2P */ <br>
+ /* PCIe P2P or extended system scope coherence */ <br>
else <br>
mapping_flags |= AMDGPU_VM_MTYPE_UC; <br>
</blockquote>
</blockquote>
<p>It would probably be clearer if these two branches were swapped, so that the first one tested <br>
</p>
<p>(!svm_nodes_in_same_hive(bo_node, node) || ext_coherent)</p>
<p>Not a required change, though.<br>
</p>
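<p>To illustrate the swap, here is a rough standalone sketch of just the two non-local branches. The enum and function names are hypothetical, and same_hive stands in for the result of svm_nodes_in_same_hive(bo_node, node):</p>
<pre>
```c
/* Hypothetical stand-ins for AMDGPU_VM_MTYPE_NC / AMDGPU_VM_MTYPE_UC. */
enum remote_mtype { REMOTE_MTYPE_NC, REMOTE_MTYPE_UC };

/* Branch order as in the patch: the NC case is tested first. */
enum remote_mtype remote_mtype_as_posted(int same_hive, int ext_coherent)
{
	if (same_hive && !ext_coherent)
		return REMOTE_MTYPE_NC;	/* far HBM or remote XGMI GPU */
	else
		return REMOTE_MTYPE_UC;	/* PCIe P2P or ext_coherent */
}

/* Swapped order: the UC case is tested first, using the negated
 * condition (De Morgan), so each comment sits next to its own test. */
enum remote_mtype remote_mtype_swapped(int same_hive, int ext_coherent)
{
	if (!same_hive || ext_coherent)
		return REMOTE_MTYPE_UC;	/* PCIe P2P or ext_coherent */
	else
		return REMOTE_MTYPE_NC;	/* far HBM or remote XGMI GPU */
}
```
</pre>
<p>The two orderings select the same MTYPE for every combination of inputs; only the reading order changes.</p>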
<blockquote type="cite">
<blockquote type="cite" style="color: rgb(0, 124, 255);"> /* system memory accessed by the APU */
</blockquote>
</blockquote>
<p>As written, this patch stops ext_coherent from affecting gfx9.4.3 APU devices, which it still should.</p>
<p>The following change (or an equivalent) needs to be added just below this hunk:<br>
</p>
<p> if (num_possible_nodes() <= 1)<br>
mapping_flags |= mtype_local;<br>
else<br>
- mapping_flags |= AMDGPU_VM_MTYPE_NC;<br>
+ mapping_flags |= ext_coherent ? AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
<br>
</p>
</div>
</body>
</html>