[PATCH] drm/amdkfd: Fix EXT_COHERENT memory allocation crash

Francis, David David.Francis at amd.com
Tue Oct 3 19:01:00 UTC 2023


> If there is no VRAM domain, bo_node is NULL and dereferencing it causes
> a crash. Move the EXT_COHERENT handling into the VRAM domain path.
>
> Another patch is needed to support overriding the PTE flag on APUs.
>
> Fixes: 55d7e2001c7e ("drm/amdgpu: Add EXT_COHERENT memory allocation flags")
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 0d88698ae33f..150a3e88691d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1252,19 +1252,17 @@ svm_range_get_pte_flags(struct kfd_node *node,
>                 snoop = true;
>                 if (uncached) {
>                         mapping_flags |= AMDGPU_VM_MTYPE_UC;
> -               } else if (ext_coherent) {
> -                       /* local HBM region close to partition */
> -                       if (bo_node->adev == node->adev &&
> -                           (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
> -                               mapping_flags |= AMDGPU_VM_MTYPE_CC;
> -                       else
> -                               mapping_flags |= AMDGPU_VM_MTYPE_UC;
>                 } else if (domain == SVM_RANGE_VRAM_DOMAIN) {
>                         /* local HBM region close to partition */
>                         if (bo_node->adev == node->adev &&
>                             (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
> -                               mapping_flags |= mtype_local;
> +                               if (ext_coherent)
> +                                       mapping_flags |= AMDGPU_VM_MTYPE_CC;
> +                               else
> +                                       mapping_flags |= mtype_local;

The body of this if statement is more than one line long (it is a nested
if/else), so it should be wrapped in "{}".
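
For illustration only, here is a minimal user-space sketch of how the branch
could look with the braces added. It is not the kernel code: the enum values,
struct model_node, struct model_xcp, same_hive() and pick_mtype() are
hypothetical stand-ins, and the branches that are not visible in the quoted
hunk (PCIe P2P, system memory) are left as placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the AMDGPU_VM_MTYPE_* flags and SVM domains. */
enum model_mtype { MTYPE_NONE, MTYPE_NC, MTYPE_UC, MTYPE_CC, MTYPE_RW };
enum model_domain { DOMAIN_SYSTEM, DOMAIN_VRAM };

/* Hypothetical, heavily reduced models of the xcp and kfd_node structures. */
struct model_xcp { int mem_id; };
struct model_node {
	int adev;                    /* identifies the owning device */
	const struct model_xcp *xcp; /* may be NULL, as in the patch */
	int hive_id;                 /* assumed way to compare XGMI hives */
};

/* Assumed stand-in for svm_nodes_in_same_hive(). */
static bool same_hive(const struct model_node *a, const struct model_node *b)
{
	return a->hive_id == b->hive_id;
}

enum model_mtype pick_mtype(const struct model_node *node,
			    const struct model_node *bo_node,
			    enum model_domain domain, bool uncached,
			    bool ext_coherent, enum model_mtype mtype_local)
{
	enum model_mtype mtype = MTYPE_NONE;

	assert(node != NULL);

	if (uncached) {
		mtype = MTYPE_UC;
	} else if (domain == DOMAIN_VRAM) {
		/* bo_node is only dereferenced inside the VRAM path. */
		assert(bo_node != NULL);

		/* local HBM region close to partition */
		if (bo_node->adev == node->adev &&
		    (!bo_node->xcp || !node->xcp ||
		     bo_node->xcp->mem_id == node->xcp->mem_id)) {
			if (ext_coherent)
				mtype = MTYPE_CC;
			else
				mtype = mtype_local;
		/* local HBM region far from partition or remote XGMI GPU */
		} else if (ext_coherent) {
			mtype = MTYPE_UC;
		} else if (same_hive(bo_node, node)) {
			mtype = MTYPE_NC;
		}
		/* PCIe P2P: not shown in the quoted hunk, left as MTYPE_NONE */
	}
	/* Other domains: not modelled here; bo_node is never touched. */

	return mtype;
}
```

With the braces in place it is obvious that the "else if (ext_coherent)"
belongs to the outer locality check and not to the inner ext_coherent test.
In this sketch, calling pick_mtype() with DOMAIN_SYSTEM and a NULL bo_node
returns without dereferencing bo_node.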

>                         /* local HBM region far from partition or remote XGMI GPU */
> +                       else if (ext_coherent)
> +                               mapping_flags |= AMDGPU_VM_MTYPE_UC;
>                         else if (svm_nodes_in_same_hive(bo_node, node))
>                                 mapping_flags |= AMDGPU_VM_MTYPE_NC;
>                         /* PCIe P2P */
________________________________
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Philip Yang <Philip.Yang at amd.com>
Sent: Tuesday, October 3, 2023 12:57 PM
To: amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
Cc: Yang, Philip <Philip.Yang at amd.com>; Kuehling, Felix <Felix.Kuehling at amd.com>
Subject: [PATCH] drm/amdkfd: Fix EXT_COHERENT memory allocation crash

If there is no VRAM domain, bo_node is NULL and dereferencing it causes
a crash. Move the EXT_COHERENT handling into the VRAM domain path.

Another patch is needed to support overriding the PTE flag on APUs.

Fixes: 55d7e2001c7e ("drm/amdgpu: Add EXT_COHERENT memory allocation flags")
Signed-off-by: Philip Yang <Philip.Yang at amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 0d88698ae33f..150a3e88691d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1252,19 +1252,17 @@ svm_range_get_pte_flags(struct kfd_node *node,
                snoop = true;
                if (uncached) {
                        mapping_flags |= AMDGPU_VM_MTYPE_UC;
-               } else if (ext_coherent) {
-                       /* local HBM region close to partition */
-                       if (bo_node->adev == node->adev &&
-                           (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
-                               mapping_flags |= AMDGPU_VM_MTYPE_CC;
-                       else
-                               mapping_flags |= AMDGPU_VM_MTYPE_UC;
                } else if (domain == SVM_RANGE_VRAM_DOMAIN) {
                        /* local HBM region close to partition */
                        if (bo_node->adev == node->adev &&
                            (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
-                               mapping_flags |= mtype_local;
+                               if (ext_coherent)
+                                       mapping_flags |= AMDGPU_VM_MTYPE_CC;
+                               else
+                                       mapping_flags |= mtype_local;
                        /* local HBM region far from partition or remote XGMI GPU */
+                       else if (ext_coherent)
+                               mapping_flags |= AMDGPU_VM_MTYPE_UC;
                        else if (svm_nodes_in_same_hive(bo_node, node))
                                mapping_flags |= AMDGPU_VM_MTYPE_NC;
                        /* PCIe P2P */
--
2.35.1
