[PATCH] drm/amd/amdgpu: set gtt size according to system memory size only

He, Roger Hongbo.He at amd.com
Mon Dec 11 02:49:07 UTC 2017


-----Original Message-----
From: Michel Dänzer [mailto:michel at daenzer.net] 
Sent: Saturday, December 09, 2017 1:26 AM
To: Koenig, Christian <Christian.Koenig at amd.com>; He, Roger <Hongbo.He at amd.com>
Cc: amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: set gtt size according to system memory size only

On 2017-12-07 07:07 PM, Christian König wrote:
> Am 07.12.2017 um 18:34 schrieb Michel Dänzer:
>> On 2017-11-29 10:12 AM, Roger He wrote:
>>> Change-Id: Ib634375b90d875fe04a890fc82fb1e3b7112676a
>>> Signed-off-by: Roger He <Hongbo.He at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++-----
>>>   1 file changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 17bf0ce..d0661907 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -1330,11 +1330,9 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>>           struct sysinfo si;
>>>             si_meminfo(&si);
>>> -        gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>> -                   adev->mc.mc_vram_size),
>>> -                   ((uint64_t)si.totalram * si.mem_unit * 3/4));
>>> -    }
>>> -    else
>>> +        gtt_size = max(AMDGPU_DEFAULT_GTT_SIZE_MB << 20,
>>> +            (uint64_t)si.totalram * si.mem_unit * 3/4);
>>> +    } else
>>>           gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>>>       r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT, gtt_size >> PAGE_SHIFT);
>>>       if (r) {
>>>
>> I'm unable to finish a piglit run (using Mesa on Tonga in a Ryzen 7 
>> 1700 system with 16 GB of RAM) with this change. Before, I had
>>
>>   [drm] amdgpu: 3072M of GTT memory ready.
>>
>> now it's
>>
>>   [drm] amdgpu: 10473M of GTT memory ready.
>>
>> While running piglit, there's lots of
>>
>>   [TTM] Out of kernel memory
>>
>> messages, followed by more badness, and eventually the machine 
>> becomes inaccessible via SSH and has to be hard rebooted.
>>
>>
>> It occurred to me one thing not being taken into account here is that 
>> system memory is also needed for storing the contents of BOs evicted 
>> from VRAM. So I tried subtracting the VRAM size, resulting in
>>
>>   [drm] amdgpu: 8425M of GTT memory ready.
>>
>> but the problem still happened. So I tried 1/2 instead of 3/4 of RAM, 
>> resulting in
>>
>>   [drm] amdgpu: 6982M of GTT memory ready.
>>
>> and was able to finish a piglit run with that.
> 
> I think I know what is going on here. The max-texture-size test keeps 
> increasing the texture size as long as it doesn't fail to allocate one.
> 
> So the "Out of kernel memory" message is actually the desired effect 
> (but we should probably remove the message).
> 
> The big question is what happens after that. Those code paths are 
> probably not very well tested.

I'm attaching all the related dmesg output I've captured. Basically, best case is the OOM reaper killing piglit, worst case is no response to SSH => hard reboot.


Until this becomes more robust, this change should probably be reverted.

I did not get any useful information from the error log.
Please revert it for now as a quick fix; I will investigate the cause later.


Thanks
Roger(Hongbo.He)

On 2017-12-08 02:52 AM, He, Roger wrote:
> [TTM] Out of kernel memory
> The direct cause is a GTT BO swap-out failure, which results from more BOs
> being allocated; those allocations also need more acc_size.
> I am not sure why the swap-out fails or whether that is expected in this
> case; it may need investigation.

FWIW, though it's probably not directly related: This happens with the swap partition (8GB) completely unused.
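
For reference, the GTT sizes quoted in this thread can be reproduced with a small userspace sketch of the sizing arithmetic. This is not the driver code: the function names are hypothetical, everything is computed in MiB rather than bytes, the 3072 MiB default is inferred from the "3072M of GTT memory ready" line, and the 13964 MiB of usable RAM and 2048 MiB of VRAM are back-computed from the reported 10473M and 8425M values rather than stated anywhere above. The two experimental variants are assumed to ignore the default-size floor.

```c
#include <stdint.h>

/* Assumed default, inferred from the "3072M of GTT memory ready" log line. */
#define DEFAULT_GTT_SIZE_MIB 3072ULL

static inline uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }
static inline uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Before the patch: min(max(default, VRAM), 3/4 of RAM). */
uint64_t gtt_mib_before(uint64_t ram_mib, uint64_t vram_mib)
{
    return min_u64(max_u64(DEFAULT_GTT_SIZE_MIB, vram_mib), ram_mib * 3 / 4);
}

/* With the patch: max(default, 3/4 of RAM); VRAM size no longer matters. */
uint64_t gtt_mib_after(uint64_t ram_mib)
{
    return max_u64(DEFAULT_GTT_SIZE_MIB, ram_mib * 3 / 4);
}

/* First experiment: 3/4 of RAM minus VRAM, clamped at zero. */
uint64_t gtt_mib_minus_vram(uint64_t ram_mib, uint64_t vram_mib)
{
    uint64_t three_quarters = ram_mib * 3 / 4;
    return three_quarters > vram_mib ? three_quarters - vram_mib : 0;
}

/* Second experiment: 1/2 of RAM instead of 3/4. */
uint64_t gtt_mib_half(uint64_t ram_mib)
{
    return ram_mib / 2;
}
```

With ram_mib = 13964 and vram_mib = 2048, these four functions give 3072, 10473, 8425 and 6982 respectively, matching the four log lines above. Only the last value allowed the piglit run to finish.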


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

