amd/amdkfd: Fix a memory limit issue
Limonciello, Mario
mario.limonciello at amd.com
Mon Nov 14 20:58:35 UTC 2022
On 11/14/2022 12:45, Eric Huang wrote:
> This resolves a regression in which an application fails
> to allocate VRAM because it sees no free memory. The
> cause is the vram_pin_size check that was added to the
> memory limit: the application pins memory for Peerdirect,
> and KFD should not count that against the limit. Removing
> vram_pin_size from the check resolves it.
Any idea when the regression was introduced? Could you narrow it down
to a commit?
If so, it would be great to include a "Fixes" tag so that this can
also be backported to the relevant stable kernels that have the regression.
>
> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index db772942f7a6..fb1bb593312e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -172,9 +172,7 @@ int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
>              (kfd_mem_limit.ttm_mem_used + ttm_mem_needed >
>               kfd_mem_limit.max_ttm_mem_limit) ||
>              (adev && adev->kfd.vram_used + vram_needed >
> -             adev->gmc.real_vram_size -
> -             atomic64_read(&adev->vram_pin_size) -
> -             reserved_for_pt)) {
> +             adev->gmc.real_vram_size - reserved_for_pt)) {
>                  ret = -ENOMEM;
>                  goto release;
>          }
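
For anyone following along, here is a minimal user-space sketch of what the hunk above changes. It is not the driver code: struct vram_state and both helper functions are hypothetical stand-ins for the fields the hunk reads, and the numbers in the usage note are made up. It also assumes the pinned buffer was itself allocated through KFD, so it is already part of vram_used; the thread does not confirm that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the values the hunk reads; not the real structs. */
struct vram_state {
	uint64_t real_vram_size;  /* models adev->gmc.real_vram_size */
	uint64_t vram_pin_size;   /* models atomic64_read(&adev->vram_pin_size) */
	uint64_t reserved_for_pt; /* models the page-table reservation */
	uint64_t kfd_vram_used;   /* models adev->kfd.vram_used */
};

/* VRAM part of the check before the patch: pinned VRAM lowers the limit. */
static bool vram_over_limit_old(const struct vram_state *s, uint64_t needed)
{
	return s->kfd_vram_used + needed >
	       s->real_vram_size - s->vram_pin_size - s->reserved_for_pt;
}

/* VRAM part of the check after the patch: pinned VRAM is ignored. */
static bool vram_over_limit_new(const struct vram_state *s, uint64_t needed)
{
	return s->kfd_vram_used + needed >
	       s->real_vram_size - s->reserved_for_pt;
}
```

With real_vram_size = 16, reserved_for_pt = 1, kfd_vram_used = 8 and all 8 of those units pinned, a request for 4 more units is rejected by the old check (12 > 7) and accepted by the new one (12 <= 15). A request for 8 units is still rejected by the new check (16 > 15).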