[PATCH 1/9] drm/amdgpu: generally allow over-commit during BO allocation
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Dec 5 13:41:04 UTC 2022
On 25.11.22 at 19:18, Alex Deucher wrote:
> On Fri, Nov 25, 2022 at 5:21 AM Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
>> We already fall back to a dummy BO with no backing store when we
>> allocate GDS, GWS and OA resources, and to GTT when we allocate VRAM.
>>
>> Drop all those workarounds and generalize this for GTT as well. This
>> fixes ENOMEM issues with runaway applications which try to allocate/free
>> GTT in a loop and are otherwise only limited by the CPU speed.
>>
>> The CS will wait for the cleanup of freed-up BOs to satisfy the
>> various domain-specific limits and so effectively throttle those
>> buggy applications down to a sane allocation behavior again.
>>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
> This looks like a good bug fix and unrelated to the rest of this series.
> Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
Yeah, this was only in the series because I was trying to address a bug report.
The TTM changes mitigated the bugs, but this patch is the real
underlying fix.
I've cherry-picked it over to amd-staging-drm-next and pushed it.
Thanks,
Christian.
>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 16 +++-------------
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +-----
>> 2 files changed, 4 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> index a0780a4e3e61..62e98f1ad770 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> @@ -113,7 +113,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
>> bp.resv = resv;
>> bp.preferred_domain = initial_domain;
>> bp.flags = flags;
>> - bp.domain = initial_domain;
>> + bp.domain = initial_domain | AMDGPU_GEM_DOMAIN_CPU;
>> bp.bo_ptr_size = sizeof(struct amdgpu_bo);
>>
>> r = amdgpu_bo_create_user(adev, &bp, &ubo);
>> @@ -332,20 +332,10 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
>> }
>>
>> initial_domain = (u32)(0xffffffff & args->in.domains);
>> -retry:
>> r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>> - initial_domain,
>> - flags, ttm_bo_type_device, resv, &gobj);
>> + initial_domain, flags, ttm_bo_type_device,
>> + resv, &gobj);
>> if (r && r != -ERESTARTSYS) {
>> - if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
>> - flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>> - goto retry;
>> - }
>> -
>> - if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
>> - initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
>> - goto retry;
>> - }
>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>> size, initial_domain, args->in.alignment, r);
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 974e85d8b6cc..919bbea2e3ac 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -581,11 +581,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>> bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
>>
>> bo->tbo.bdev = &adev->mman.bdev;
>> - if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA |
>> - AMDGPU_GEM_DOMAIN_GDS))
>> - amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
>> - else
>> - amdgpu_bo_placement_from_domain(bo, bp->domain);
>> + amdgpu_bo_placement_from_domain(bo, bp->domain);
>> if (bp->type == ttm_bo_type_kernel)
>> bo->tbo.priority = 1;
>>
>> --
>> 2.34.1
>>
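For readers following the first hunk, here is a minimal standalone sketch of what the changed assignment does to the two domain masks. It is not the driver code: the demo_* names and the struct are hypothetical, and the DEMO_DOMAIN_* bit values are only assumed to mirror the AMDGPU_GEM_DOMAIN_* flags. The point it illustrates is that the preferred domain stays as requested, while the set of allowed domains always gains the CPU domain, which the commit message describes as the fallback with no backing store.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values, assumed to mirror the AMDGPU_GEM_DOMAIN_* flags. */
#define DEMO_DOMAIN_CPU  0x1u
#define DEMO_DOMAIN_GTT  0x2u
#define DEMO_DOMAIN_VRAM 0x4u

/* Hypothetical stand-in for the two domain fields filled in by the patch. */
struct demo_bo_param {
	uint32_t preferred_domain; /* where the BO should ideally live */
	uint32_t domain;           /* every domain the BO is allowed to use */
};

/* Mirrors the patched assignment: the CPU domain is always allowed. */
static struct demo_bo_param demo_fill_param(uint32_t initial_domain)
{
	struct demo_bo_param bp;

	bp.preferred_domain = initial_domain;
	bp.domain = initial_domain | DEMO_DOMAIN_CPU;
	return bp;
}

/* Returns nonzero if the BO may be placed in any of the given domains. */
static int demo_may_place(const struct demo_bo_param *bp, uint32_t domain)
{
	assert(bp);
	return (bp->domain & domain) != 0;
}
```

With a VRAM-only request, demo_may_place() reports that the CPU domain is allowed but GTT is not, which matches the diff: the explicit VRAM-to-GTT retry is dropped rather than folded into the mask.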