[PATCH v5 4/6] drm/amdkfd: Evict BO itself for contiguous allocation

Felix Kuehling felix.kuehling at amd.com
Tue Apr 23 22:15:14 UTC 2024


On 2024-04-23 11:28, Philip Yang wrote:
> If the BO pages pinned for RDMA are not contiguous in VRAM, evict the BO
> to system memory first to free the VRAM space, then allocate contiguous
> VRAM space and move the BO back from system memory to VRAM.
>
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++++++++++++++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index ef9154043757..5d118e5580ce 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1470,13 +1470,27 @@ static int amdgpu_amdkfd_gpuvm_pin_bo(struct amdgpu_bo *bo, u32 domain)
>   	if (unlikely(ret))
>   		return ret;
>   
> +	if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) {
> +		/*
> +		 * If bo is not contiguous on VRAM, move to system memory first to ensure
> +		 * we can get contiguous VRAM space after evicting other BOs.
> +		 */
> +		if (!(bo->tbo.resource->placement & TTM_PL_FLAG_CONTIGUOUS)) {
> +			ret = amdgpu_amdkfd_bo_validate(bo, AMDGPU_GEM_DOMAIN_GTT, false);

amdgpu_amdkfd_bo_validate is meant for use in kernel threads and always
runs uninterruptibly. I believe pin_bo runs in the context of ioctls
from user mode, so the validation here should be interruptible.
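
To illustrate the difference, here is a minimal user-space sketch, not
kernel code. The names op_ctx, task_model and model_validate are
hypothetical and exist only for this example; the one thing borrowed from
TTM is the idea of an "interruptible" flag like the one carried in
struct ttm_operation_ctx. With the flag set, a pending signal makes the
wait bail out with -ERESTARTSYS so the ioctl can be restarted; with it
clear, the caller blocks until the move completes.

```c
#include <errno.h>
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal and absent from user-space <errno.h>;
 * 512 is the value used in the kernel's include/linux/errno.h. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-in for the one field of struct ttm_operation_ctx
 * that matters for this discussion. */
struct op_ctx {
	bool interruptible;
};

/* Hypothetical model of the calling task: whether a signal is pending
 * and how many wait steps the BO move still needs. */
struct task_model {
	bool signal_pending;
	int steps_remaining;
};

/* Model of waiting for a BO move to finish.
 * Interruptible: return -ERESTARTSYS as soon as a signal is pending.
 * Uninterruptible: ignore signals and wait until the move is done. */
int model_validate(struct task_model *t, const struct op_ctx *ctx)
{
	while (t->steps_remaining > 0) {
		if (ctx->interruptible && t->signal_pending)
			return -ERESTARTSYS;
		t->steps_remaining--;
	}
	return 0;
}
```

In the patch itself this would presumably mean validating through a path
that passes an interruptible context to ttm_bo_validate instead of going
through amdgpu_amdkfd_bo_validate. That is an assumption on how to apply
the comment, not a prescribed fix.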

Regards,
   Felix


> +			if (unlikely(ret)) {
> +				pr_debug("validate bo 0x%p to GTT failed %d\n", &bo->tbo, ret);
> +				goto out;
> +			}
> +		}
> +	}
> +
>   	ret = amdgpu_bo_pin_restricted(bo, domain, 0, 0);
>   	if (ret)
>   		pr_err("Error in Pinning BO to domain: %d\n", domain);
>   
>   	amdgpu_bo_sync_wait(bo, AMDGPU_FENCE_OWNER_KFD, false);
> +out:
>   	amdgpu_bo_unreserve(bo);
> -
>   	return ret;
>   }
>   

