[Intel-gfx] [PATCH 1/9] drm/amdgpu: generally allow over-commit during BO allocation
Felix Kuehling
felix.kuehling at amd.com
Sat Dec 10 06:15:26 UTC 2022
On 2022-11-25 05:21, Christian König wrote:
> We already fall back to a dummy BO with no backing store when we
> allocate GDS, GWS and OA resources and to GTT when we allocate VRAM.
>
> Drop all those workarounds and generalize this for GTT as well. This
> fixes ENOMEM issues with runaway applications which try to allocate/free
> GTT in a loop and are otherwise only limited by the CPU speed.
>
> The CS will wait for the cleanup of freed up BOs to satisfy the
> various domain specific limits and so effectively throttle those
> buggy applications down to a sane allocation behavior again.
>
> Signed-off-by: Christian König <christian.koenig at amd.com>

This patch causes some regressions in KFDTest. KFDMemoryTest.MMBench
sees a huge slowdown in VRAM allocation, and
KFDMemoryTest.LargestVramBufferTest can only allocate half the available
memory.

This seems to be caused by initially validating VRAM BOs in the CPU
domain, which allocates a ttm_tt. A subsequent validation in the VRAM
domain involves a copy from GTT to VRAM.

After that, freeing of BOs can get delayed by the ghost object of a
previous migration, which delays calling release notifiers and causes
problems for KFD's available memory accounting.
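
To illustrate the accounting problem described above, here is a minimal
userspace sketch. It assumes (this is not taken from the KFD code) that
memory is credited back to the accounting only when a release notifier
runs. All names (toy_accounting, toy_reserve, toy_release_notify,
toy_second_alloc_succeeds) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a memory limit; NOT the real KFD accounting structures. */
struct toy_accounting {
	size_t limit;	/* total memory the accounting may hand out */
	size_t used;	/* memory currently counted as in use */
};

/* Reserve size bytes if the accounting still has room for them. */
static bool toy_reserve(struct toy_accounting *a, size_t size)
{
	if (size > a->limit - a->used)
		return false;
	a->used += size;
	return true;
}

/* Stand-in for a release notifier: only here is memory credited back. */
static void toy_release_notify(struct toy_accounting *a, size_t size)
{
	assert(size <= a->used);
	a->used -= size;
}

/*
 * Allocate one buffer, free it, then try to allocate the same size again.
 * release_deferred models a free whose release notifier has not run yet
 * (assumption: something still holds the BO, e.g. a ghost object).
 */
bool toy_second_alloc_succeeds(size_t limit, size_t size, bool release_deferred)
{
	struct toy_accounting a = { .limit = limit, .used = 0 };

	if (!toy_reserve(&a, size))
		return false;

	/* Userspace frees the buffer here. */
	if (!release_deferred)
		toy_release_notify(&a, size);

	return toy_reserve(&a, size);
}
```

In this model, a second allocation of 60 out of 100 units succeeds when
the notifier has run and fails when it is still pending, even though the
buffer was already freed from the application's point of view.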

I experimented with a workaround that validates BOs immediately after
allocation, but that only moves the delays around and doesn't solve the
problem. During those experiments I may also have stumbled over a bug in
ttm_buffer_object_transfer: it calls ttm_bo_set_bulk_move before
initializing and locking fbo->base.base._resv. This results in a flood
of warnings because ttm_bo_set_bulk_move expects the reservation to be
locked.
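
The ordering issue can be sketched in plain userspace C. This is only a
toy model, not TTM code: toy_resv, toy_bo, toy_set_bulk_move and the two
toy_transfer_* functions are hypothetical names, and the warning is
modelled as a counter rather than a kernel WARN.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a reservation lock; NOT the kernel's reservation object. */
struct toy_resv {
	bool initialized;
	bool locked;
};

/* Toy stand-in for the ghost buffer object created during a transfer. */
struct toy_bo {
	struct toy_resv resv;
	bool in_bulk_move;
	int warnings;	/* counts "lock not held" complaints */
};

static void toy_resv_init(struct toy_resv *r)
{
	r->initialized = true;
	r->locked = false;
}

static bool toy_resv_trylock(struct toy_resv *r)
{
	if (!r->initialized || r->locked)
		return false;
	r->locked = true;
	return true;
}

static void toy_resv_unlock(struct toy_resv *r)
{
	r->locked = false;
}

/* Models a helper that expects the caller to hold the reservation. */
static void toy_set_bulk_move(struct toy_bo *bo)
{
	if (!bo->resv.locked)
		bo->warnings++;
	bo->in_bulk_move = true;
}

/* Order as reported above: bulk move first, init and lock afterwards. */
int toy_transfer_reported_order(void)
{
	struct toy_bo ghost = {0};

	toy_set_bulk_move(&ghost);
	toy_resv_init(&ghost.resv);
	if (toy_resv_trylock(&ghost.resv))
		toy_resv_unlock(&ghost.resv);
	return ghost.warnings;
}

/* Alternative order (an assumption about a possible fix): lock first. */
int toy_transfer_lock_first(void)
{
	struct toy_bo ghost = {0};
	bool locked;

	toy_resv_init(&ghost.resv);
	locked = toy_resv_trylock(&ghost.resv);
	assert(locked);
	toy_set_bulk_move(&ghost);
	toy_resv_unlock(&ghost.resv);
	return ghost.warnings;
}
```

In this model toy_transfer_reported_order() returns 1 (one warning per
transfer) and toy_transfer_lock_first() returns 0. Whether reordering
the calls is the right fix in TTM itself is a separate question.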

Right now I'd like to remove the bp.domain = initial_domain |
AMDGPU_GEM_DOMAIN_CPU change in amdgpu_gem_object_create to fix this.

Regards,
Felix
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 16 +++-------------
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +-----
> 2 files changed, 4 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index a0780a4e3e61..62e98f1ad770 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -113,7 +113,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
>  	bp.resv = resv;
>  	bp.preferred_domain = initial_domain;
>  	bp.flags = flags;
> -	bp.domain = initial_domain;
> +	bp.domain = initial_domain | AMDGPU_GEM_DOMAIN_CPU;
>  	bp.bo_ptr_size = sizeof(struct amdgpu_bo);
>
>  	r = amdgpu_bo_create_user(adev, &bp, &ubo);
> @@ -332,20 +332,10 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
>  	}
>
>  	initial_domain = (u32)(0xffffffff & args->in.domains);
> -retry:
>  	r = amdgpu_gem_object_create(adev, size, args->in.alignment,
> -				     initial_domain,
> -				     flags, ttm_bo_type_device, resv, &gobj);
> +				     initial_domain, flags, ttm_bo_type_device,
> +				     resv, &gobj);
>  	if (r && r != -ERESTARTSYS) {
> -		if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
> -			flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> -			goto retry;
> -		}
> -
> -		if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
> -			initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
> -			goto retry;
> -		}
>  		DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>  				size, initial_domain, args->in.alignment, r);
>  	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 974e85d8b6cc..919bbea2e3ac 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -581,11 +581,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>  		bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
>
>  	bo->tbo.bdev = &adev->mman.bdev;
> -	if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA |
> -			  AMDGPU_GEM_DOMAIN_GDS))
> -		amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
> -	else
> -		amdgpu_bo_placement_from_domain(bo, bp->domain);
> +	amdgpu_bo_placement_from_domain(bo, bp->domain);
>  	if (bp->type == ttm_bo_type_kernel)
>  		bo->tbo.priority = 1;
>