[PATCH] drm/amd/amdgpu: set gtt size according to system memory size only

Michel Dänzer michel at daenzer.net
Fri Dec 8 17:26:09 UTC 2017


On 2017-12-07 07:07 PM, Christian König wrote:
> Am 07.12.2017 um 18:34 schrieb Michel Dänzer:
>> On 2017-11-29 10:12 AM, Roger He wrote:
>>> Change-Id: Ib634375b90d875fe04a890fc82fb1e3b7112676a
>>> Signed-off-by: Roger He <Hongbo.He at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++-----
>>>   1 file changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 17bf0ce..d0661907 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -1330,11 +1330,9 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>>           struct sysinfo si;
>>>             si_meminfo(&si);
>>> -        gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>> -                   adev->mc.mc_vram_size),
>>> -                   ((uint64_t)si.totalram * si.mem_unit * 3/4));
>>> -    }
>>> -    else
>>> +        gtt_size = max(AMDGPU_DEFAULT_GTT_SIZE_MB << 20,
>>> +            (uint64_t)si.totalram * si.mem_unit * 3/4);
>>> +    } else
>>>           gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>>>       r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT, gtt_size >>
>>> PAGE_SHIFT);
>>>       if (r) {
>>>
>> I'm unable to finish a piglit run (using Mesa on Tonga in a Ryzen 7 1700
>> system with 16 GB of RAM) with this change. Before, I had
>>
>>   [drm] amdgpu: 3072M of GTT memory ready.
>>
>> now it's
>>
>>   [drm] amdgpu: 10473M of GTT memory ready.
>>
>> While running piglit, there's lots of
>>
>>   [TTM] Out of kernel memory
>>
>> messages, followed by more badness, and eventually the machine becomes
>> inaccessible via SSH and has to be hard rebooted.
>>
>>
>> It occurred to me one thing not being taken into account here is that
>> system memory is also needed for storing the contents of BOs evicted
>> from VRAM. So I tried subtracting the VRAM size, resulting in
>>
>>   [drm] amdgpu: 8425M of GTT memory ready.
>>
>> but the problem still happened. So I tried 1/2 instead of 3/4 of RAM,
>> resulting in
>>
>>   [drm] amdgpu: 6982M of GTT memory ready.
>>
>> and was able to finish a piglit run with that.
> 
> I think I know what is going on here. The max-texture-size keeps
> increasing the texture size as long as it doesn't fails to allocate one.
> 
> So the "Out of kernel memory" message is actually the desired effect
> (but we should probably remove the message).
> 
> The price question is what happens after that? Those code paths are
> probably not very well tested.

I'm attaching all the related dmesg output I've captured. Basically,
best case is the OOM reaper killing piglit, worst case is no response to
SSH => hard reboot.


Until this becomes more robust, this change should probably be reverted.


On 2017-12-08 02:52 AM, He, Roger wrote:
>                  [TTM] Out of kernel memory
> The direct reason is GTT BO swap out failure which results from more Bo allocation. And  along with that need more acc_size.
> But why swap out failure I am not sure that is expected here for this case, maybe need to investigate.

FWIW, though it's probably not directly related: This happens with the
swap partition (8GB) completely unused.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kern.log.gz
Type: application/gzip
Size: 131100 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20171208/01a2d91e/attachment-0001.gz>


More information about the amd-gfx mailing list