[PATCH v2 1/6] drm/amdgpu: Support contiguous VRAM allocation
Christian König
ckoenig.leichtzumerken at gmail.com
Thu Apr 18 14:37:57 UTC 2024
On 18.04.24 at 15:57, Philip Yang wrote:
> RDMA devices with limited scatter-gather ability require contiguous VRAM
> buffer allocation for RDMA peer direct support.
>
> Add a new KFD alloc memory flag and store it as the bo alloc flag
> AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS. When this bo is pinned for export for RDMA
> peerdirect access, this sets the TTM_PL_FLAG_CONTIGUOUS flag and asks
> the VRAM buddy allocator for contiguous VRAM.
>
> Remove the 2GB max memory block size limit for contiguous allocation.
>
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 ++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 9 +++++++--
> include/uapi/linux/kfd_ioctl.h | 1 +
> 3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 0ae9fd844623..ef9154043757 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1712,6 +1712,10 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
> alloc_flags = AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
> alloc_flags |= (flags & KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC) ?
> AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED : 0;
> +
> + /* For contiguous VRAM allocation */
> + if (flags & KFD_IOC_ALLOC_MEM_FLAGS_CONTIGUOUS_BEST_EFFORT)
> + alloc_flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
> }
> xcp_id = fpriv->xcp_id == AMDGPU_XCP_NO_PARTITION ?
> 0 : fpriv->xcp_id;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 4be8b091099a..2f2ae7177771 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -532,8 +532,13 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
>
> BUG_ON(min_block_size < mm->chunk_size);
>
> - /* Limit maximum size to 2GiB due to SG table limitations */
> - size = min(remaining_size, 2ULL << 30);
> + if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
> + size = remaining_size;
> + else
> + /* Limit maximum size to 2GiB due to SG table limitations
> + * for non-contiguous allocation.
> + */
> + size = min(remaining_size, 2ULL << 30);
Oh, I totally missed this in the first review. That won't work as is:
the sg table limit is still there even if the BO is contiguous.
The only option would be to fix up the VRAM P2P support to use multiple
segments in the sg table.
Regards,
Christian.
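
To illustrate the multi-segment idea mentioned above, here is a minimal
userspace sketch (not kernel code, and not the actual amdgpu fix). It assumes
the limit comes from each sg entry carrying a 32-bit length, and it caps every
segment at the same 2GiB value the patch uses. The names struct seg,
MAX_SEG_SIZE and split_contiguous() are hypothetical and exist only for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-segment cap, mirroring the 2GiB limit in the patch. */
#define MAX_SEG_SIZE (2ULL << 30)

/* Hypothetical stand-in for one sg table entry (not struct scatterlist). */
struct seg {
	uint64_t addr; /* start address of this segment */
	uint32_t len;  /* 32-bit length, so one entry cannot describe >= 4GiB */
};

/*
 * Split one physically contiguous range into segments of at most
 * MAX_SEG_SIZE bytes. Returns the number of segments written, or 0 if
 * the range does not fit into max_segs entries.
 */
static size_t split_contiguous(uint64_t start, uint64_t size,
			       struct seg *out, size_t max_segs)
{
	size_t n = 0;

	assert(out != NULL);

	while (size > 0) {
		uint64_t chunk = size < MAX_SEG_SIZE ? size : MAX_SEG_SIZE;

		if (n == max_segs)
			return 0; /* caller did not provide enough entries */

		out[n].addr = start;
		out[n].len = (uint32_t)chunk; /* safe: chunk <= 2GiB */
		n++;

		start += chunk;
		size -= chunk;
	}

	return n;
}
```

With this sketch, a contiguous 5GiB range is described by three entries of
2GiB, 2GiB and 1GiB whose addresses follow each other directly, so the buffer
stays contiguous even though no single entry exceeds the cap.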
>
> if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
> !(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
> diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
> index 2040a470ddb4..c1394c162d4e 100644
> --- a/include/uapi/linux/kfd_ioctl.h
> +++ b/include/uapi/linux/kfd_ioctl.h
> @@ -407,6 +407,7 @@ struct kfd_ioctl_acquire_vm_args {
> #define KFD_IOC_ALLOC_MEM_FLAGS_COHERENT (1 << 26)
> #define KFD_IOC_ALLOC_MEM_FLAGS_UNCACHED (1 << 25)
> #define KFD_IOC_ALLOC_MEM_FLAGS_EXT_COHERENT (1 << 24)
> +#define KFD_IOC_ALLOC_MEM_FLAGS_CONTIGUOUS_BEST_EFFORT (1 << 23)
>
> /* Allocate memory for later SVM (shared virtual memory) mapping.
> *