[PATCH] drm/amdkfd: Fix EXT_COHERENT memory allocation crash
Felix Kuehling
felix.kuehling at amd.com
Tue Oct 3 17:55:30 UTC 2023
On 2023-10-03 12:57, Philip Yang wrote:
> If there is no VRAM domain, bo_node is NULL and dereferencing it causes a
> crash. Move the EXT_COHERENT handling into the VRAM domain path.
>
> A separate patch is needed to support overriding the PTE flag on APUs.
>
> Fixes: 55d7e2001c7e ("drm/amdgpu: Add EXT_COHERENT memory allocation flags")
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 0d88698ae33f..150a3e88691d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1252,19 +1252,17 @@ svm_range_get_pte_flags(struct kfd_node *node,
> snoop = true;
> if (uncached) {
> mapping_flags |= AMDGPU_VM_MTYPE_UC;
> - } else if (ext_coherent) {
> - /* local HBM region close to partition */
> - if (bo_node->adev == node->adev &&
> - (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
> - mapping_flags |= AMDGPU_VM_MTYPE_CC;
> - else
> - mapping_flags |= AMDGPU_VM_MTYPE_UC;
> } else if (domain == SVM_RANGE_VRAM_DOMAIN) {
> /* local HBM region close to partition */
> if (bo_node->adev == node->adev &&
> (!bo_node->xcp || !node->xcp || bo_node->xcp->mem_id == node->xcp->mem_id))
> - mapping_flags |= mtype_local;
> + if (ext_coherent)
> + mapping_flags |= AMDGPU_VM_MTYPE_CC;
> + else
> + mapping_flags |= mtype_local;
I'd prefer that this not override the local mtype unless we're using
the default. So I'd recommend a patch that makes the following changes instead:
mtype_local = amdgpu_mtype_local == 1 ? AMDGPU_VM_MTYPE_NC :
- (amdgpu_mtype_local == 2 ? AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW);
+ (amdgpu_mtype_local == 2 || ext_coherent ? AMDGPU_VM_MTYPE_CC :
+ AMDGPU_VM_MTYPE_RW);
...
- /* local HBM region far from partition or remote XGMI GPU */
- else if (svm_nodes_in_same_hive(bo_node, node))
+ /* local HBM region far from partition or remote XGMI GPU with regular system scope coherence */
+ else if (svm_nodes_in_same_hive(bo_node, node) && !ext_coherent)
mapping_flags |= AMDGPU_VM_MTYPE_NC;
- /* PCIe P2P */
+ /* PCIe P2P or extended system scope coherence */
else
mapping_flags |= AMDGPU_VM_MTYPE_UC;
Regards,
Felix
> /* local HBM region far from partition or remote XGMI GPU */
> + else if (ext_coherent)
> + mapping_flags |= AMDGPU_VM_MTYPE_UC;
> else if (svm_nodes_in_same_hive(bo_node, node))
> mapping_flags |= AMDGPU_VM_MTYPE_NC;
> /* PCIe P2P */
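
For readers following the thread, the selection logic suggested in the review can be sketched as a standalone C function. This is only an illustration under stated assumptions, not the kernel code: the enum values, `struct node`, its fields, and `pick_mtype` are hypothetical stand-ins for `AMDGPU_VM_MTYPE_*`, `struct kfd_node`, and `svm_range_get_pte_flags`. The non-VRAM path is not shown in this thread, so the sketch returns a placeholder for it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the AMDGPU_VM_MTYPE_* flags. */
enum mtype { MTYPE_NC, MTYPE_CC, MTYPE_RW, MTYPE_UC, MTYPE_UNSPECIFIED };
enum domain { DOMAIN_SYSTEM, DOMAIN_VRAM };

/* Hypothetical, simplified model of a KFD node. */
struct node {
	int adev;	/* stand-in for the amdgpu_device pointer */
	int mem_id;	/* stand-in for xcp->mem_id; negative means "no xcp" */
	int hive;	/* stand-in for XGMI hive membership */
};

/*
 * Sketch of the suggested logic. bo_node may be NULL when the range is
 * not in the VRAM domain, so it is only dereferenced inside that branch.
 */
static enum mtype pick_mtype(const struct node *node,
			     const struct node *bo_node,
			     enum domain domain, int mtype_local_param,
			     bool uncached, bool ext_coherent)
{
	/* ext_coherent only changes the default (RW); an explicit NC stays. */
	enum mtype mtype_local = mtype_local_param == 1 ? MTYPE_NC :
		((mtype_local_param == 2 || ext_coherent) ? MTYPE_CC :
							    MTYPE_RW);

	if (uncached)
		return MTYPE_UC;

	if (domain == DOMAIN_VRAM) {
		/* local HBM region close to partition */
		if (bo_node->adev == node->adev &&
		    (bo_node->mem_id < 0 || node->mem_id < 0 ||
		     bo_node->mem_id == node->mem_id))
			return mtype_local;
		/* far HBM or remote XGMI GPU, regular system scope coherence */
		if (bo_node->hive == node->hive && !ext_coherent)
			return MTYPE_NC;
		/* PCIe P2P or extended system scope coherence */
		return MTYPE_UC;
	}

	/* Non-VRAM handling is not shown in this thread; placeholder only. */
	return MTYPE_UNSPECIFIED;
}
```

With this shape, an ext_coherent range that is local to the partition gets CC only when the module parameter is left at its default or set to 2, while an ext_coherent range on a remote GPU in the same hive falls through to UC.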