[PATCH] drm/amd/amdgpu: set gtt size according to system memory size only
Michel Dänzer
michel at daenzer.net
Fri Dec 8 17:26:09 UTC 2017
On 2017-12-07 07:07 PM, Christian König wrote:
> Am 07.12.2017 um 18:34 schrieb Michel Dänzer:
>> On 2017-11-29 10:12 AM, Roger He wrote:
>>> Change-Id: Ib634375b90d875fe04a890fc82fb1e3b7112676a
>>> Signed-off-by: Roger He <Hongbo.He at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 +++-----
>>> 1 file changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 17bf0ce..d0661907 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -1330,11 +1330,9 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>> struct sysinfo si;
>>> si_meminfo(&si);
>>> - gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>> - adev->mc.mc_vram_size),
>>> - ((uint64_t)si.totalram * si.mem_unit * 3/4));
>>> - }
>>> - else
>>> + gtt_size = max(AMDGPU_DEFAULT_GTT_SIZE_MB << 20,
>>> + (uint64_t)si.totalram * si.mem_unit * 3/4);
>>> + } else
>>> gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>>> r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT, gtt_size >>
>>> PAGE_SHIFT);
>>> if (r) {
>>>
>> I'm unable to finish a piglit run (using Mesa on Tonga in a Ryzen 7 1700
>> system with 16 GB of RAM) with this change. Before, I had
>>
>> [drm] amdgpu: 3072M of GTT memory ready.
>>
>> now it's
>>
>> [drm] amdgpu: 10473M of GTT memory ready.
>>
>> While running piglit, there's lots of
>>
>> [TTM] Out of kernel memory
>>
>> messages, followed by more badness, and eventually the machine becomes
>> inaccessible via SSH and has to be hard rebooted.
>>
>>
>> It occurred to me that one thing not being taken into account here is that
>> system memory is also needed for storing the contents of BOs evicted
>> from VRAM. So I tried subtracting the VRAM size, resulting in
>>
>> [drm] amdgpu: 8425M of GTT memory ready.
>>
>> but the problem still happened. So I tried 1/2 instead of 3/4 of RAM,
>> resulting in
>>
>> [drm] amdgpu: 6982M of GTT memory ready.
>>
>> and was able to finish a piglit run with that.
>
> I think I know what is going on here. The max-texture-size test keeps
> increasing the texture size as long as it doesn't fail to allocate one.
>
> So the "Out of kernel memory" message is actually the desired effect
> (but we should probably remove the message).
>
> The big question is what happens after that. Those code paths are
> probably not very well tested.

I'm attaching all the related dmesg output I've captured. Basically,
best case is the OOM reaper killing piglit, worst case is no response to
SSH => hard reboot.

Until this becomes more robust, this change should probably be reverted.


On 2017-12-08 02:52 AM, He, Roger wrote:
> [TTM] Out of kernel memory
> The direct cause is a GTT BO swap-out failure, which results from more BOs being allocated; those allocations in turn need more acc_size.
> I am not sure why the swap-out fails, or whether that is expected in this case; it may need investigation.
FWIW, though it's probably not directly related: This happens with the
swap partition (8GB) completely unused.
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kern.log.gz
Type: application/gzip
Size: 131100 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20171208/01a2d91e/attachment-0001.gz>