[PATCH] drm/amdgpu: don't limit gtt size on apus

Joshua Ashton joshua at froggi.es
Wed Jan 6 14:18:32 UTC 2021



On 1/6/21 1:45 PM, Christian König wrote:
> Am 06.01.21 um 14:17 schrieb Joshua Ashton:
>>
>>
>> On 1/6/21 1:05 PM, Christian König wrote:
>>> Am 06.01.21 um 14:02 schrieb Bas Nieuwenhuizen:
>>>>
>>>>
>>>> On Wed, Jan 6, 2021 at 1:54 PM Christian König 
>>>> <christian.koenig at amd.com <mailto:christian.koenig at amd.com>> wrote:
>>>>
>>>>     Am 06.01.21 um 13:47 schrieb Joshua Ashton:
>>>>     >
>>>>     >
>>>>     > On 1/6/21 7:52 AM, Christian König wrote:
>>>>     >> Am 05.01.21 um 23:31 schrieb Joshua Ashton:
>>>>     >>> On 1/5/21 10:10 PM, Alex Deucher wrote:
>>>>     >>>> On Tue, Jan 5, 2021 at 5:05 PM Joshua Ashton
>>>>     <joshua at froggi.es <mailto:joshua at froggi.es>> wrote:
>>>>     >>>>>
>>>>     >>>>> Since commit 24562523688b ("Revert "drm/amd/amdgpu: set 
>>>> gtt size
>>>>     >>>>> according to system memory size only""), the GTT size was
>>>>     limited by
>>>>     >>>>> 3GiB or VRAM size.
>>>>     >>>>
>>>>     >>>> The commit in question was to fix a hang with certain tests
>>>>     on APUs.
>>>>     >>>> That should be tested again before we re-enable this.  If it
>>>>     is fixed,
>>>>     >>>> we should just revert the revert rather than special case 
>>>> dGPUs.
>>>>     >>>>
>>>>     >>>> Alex
>>>>     >>>>
>>>>     >>>
>>>>     >>> I think the commit before the revert (ba851eed895c) has some
>>>>     >>> fundamental problems:
>>>>     >>>
>>>>     >>> It was always specifying max(3GiB, 3/4ths RAM) of GTT, even if
>>>>     that
>>>>     >>> wouldn't fit into say, 1GiB or 2GiB of available RAM.
>>>>     >>>
>>>>     >>> Limiting GTT to min(max(3GiB, VRAM), 3/4ths RAM) size on dGPUs
>>>>     makes
>>>>     >>> sense also and is a sensible limit to avoid silly situations 
>>>> with
>>>>     >>> overallocation and potential OOM.
>>>>     >>>
>>>>     >>> This patch solves both of those issues.
>>>>     >>
>>>>     >> No, Alex is right this approach was already tried and it causes
>>>>     >> problems.
>>>>     >>
>>>>     >> Additional to that why should this be an issue? Even when 
>>>> VRAM is
>>>>     >> very small on APUs we still use 3GiB of GTT.
>>>>     >>
>>>>     >> Regards,
>>>>     >> Christian.
>>>>     >
>>>>     > The problem is that 3GiB of GTT isn't enough for most modern 
>>>> games.
>>>>
>>>>     You seem to misunderstand what the GTT size means here. This is the
>>>>     amount of memory an application can lock down in a single command
>>>>     submissions.
>>>>
>>>>     It is still possible for the game to use all of system memory for
>>>>     textures etc... it can just happen that some buffers are temporary
>>>>     marked as inaccessible for the GPU.
>>>>
>>>>
>>>> For Vulkan we (both RADV and AMDVLK) use GTT as the total size. 
>>>> Usage in modern games is essentially "bindless" so there is no way 
>>>> to track at a per-submission level what memory needs to be resident. 
>>>> (and even with tracking applications are allowed to use all the 
>>>> memory in a single draw call, which would be unsplittable anyway ...)
>>>
>>> Yeah, that is a really good point.
>>>
>>> The issue is that we need some limitation since 3/4 of system memory 
>>> is way to much and the max texture size test in piglit can cause a 
>>> system crash.
>>>
>>> The alternative is a better OOM handling, so that an application 
>>> which uses to much system memory through the driver stack has a more 
>>> likely chance to get killed. Cause currently that is either X or 
>>> Wayland :(
>>>
>>> Christian.
>>
>> As I understand it, what is being exposed right now is essentially 
>> max(vram size, 3GiB) limited by 3/4ths of the memory. Previously, 
>> before the revert what was being taken was just max(3GiB, 3/4ths).
>>
>> If you had < 3GiB of system memory that seems like a bit of an issue 
>> that could easily leat to OOM to me?
> 
> Not really, as I said GTT is only the memory the GPU can lock at the 
> same time. It is perfectly possible to have that larger than the 
> available system memory.
> 
> In other words this is *not* to prevent using to much system memory, for 
> this we have an additional limit inside TTM. But instead to have a 
> reasonable limit for applications to not use to much memory at the same 
> time.
> 

Worth noting that this GTT size here also affects the memory reporting 
and budgeting for applications. If the user has 1GiB of total system 
memory and 3GiB set here, then 3GiB will be the budget and size exposed 
to applications too...

(On APUs,) we really don't want to expose more GTT than system memory. 
Apps will eat into it and end up swapping or running into OOM or 
swapping *very* quickly. (I imagine this is likely what was being run 
into before the revert.)

Alternatively, in RADV and other user space drivers like AMDVLK, we 
could limit this to the system memory size or 3/4ths ourselves. Although 
that's kinda gross and I don't think that's the correct path...

>>
>> Are you hitting on something smaller than 3/4ths right now? I remember 
>> the source commit mentioned they only had 1GiB of system memory 
>> available, so that could be possible if you had a carveout of < 786MiB...
> 
> What do you mean with that? I don't have a test system at hand for this 
> if that's what you are asking for.

This was mainly a question to whoever did the revert. The question to 
find out some extra info about what they are using at the time.

- Joshie 🐸✨

> 
> Regards,
> Christian.
> 
>>
>> - Joshie 🐸✨
>>
>>>
>>>>
>>>>
>>>>     > My laptop has a 128MiB carveout which is not possible to be
>>>>     configured
>>>>     > in the BIOS so I am stuck with that size without extra kernel
>>>>     > parameters which shouldn't be necessary.
>>>>
>>>>     Did you ran into problems without the parameter?
>>>>
>>>>     >
>>>>     > If you dislike the approach of keeping the extra check for dGPUs
>>>>     and
>>>>     > limiting GTT there, then I would say that we should use
>>>>     >     gtt_size = 3/4ths system memory
>>>>     > for all devices instead of
>>>>     >     gtt_size = max(3/4ths system memory, 3GiB)
>>>>     > as it was before the revert, as it is problematic on systems 
>>>> with <
>>>>     > 3GiB of system memory.
>>>>
>>>>     Yeah, that's indeed not a good idea.
>>>>
>>>>     Regards,
>>>>     Christian.
>>>>
>>>>     >
>>>>     > - Joshie 🐸✨
>>>>     >
>>>>     >>
>>>>     >>>
>>>>     >>> - Joshie 🐸✨
>>>>     >>>
>>>>     >>>>
>>>>     >>>>>
>>>>     >>>>> This is problematic on APUs, especially with a small carveout
>>>>     >>>>> which can be as low as a fixed 128MiB, as there would be 
>>>> very a
>>>>     >>>>> limited
>>>>     >>>>> 3GiB available for video memory.
>>>>     >>>>> This obviously does not meet the demands of modern 
>>>> applications.
>>>>     >>>>>
>>>>     >>>>> This patch makes it so the GTT size heuristic always uses
>>>>     3/4ths of
>>>>     >>>>> the system memory size on APUs (limiting the size by
>>>>     3GiB/VRAM size
>>>>     >>>>> only on devices with dedicated video memory).
>>>>     >>>>>
>>>>     >>>>> Fixes: 24562523688b ("Revert drm/amd/amdgpu: set gtt size
>>>>     >>>>> according to
>>>>     >>>>> system memory size only")
>>>>     >>>>>
>>>>     >>>>> Signed-off-by: Joshua Ashton <joshua at froggi.es
>>>>     <mailto:joshua at froggi.es>>
>>>>     >>>>> ---
>>>>     >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  5 +++--
>>>>     >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 12 +++++++++---
>>>>     >>>>>   2 files changed, 12 insertions(+), 5 deletions(-)
>>>>     >>>>>
>>>>     >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>     >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>     >>>>> index 72efd579ec5e..a5a41e9272d6 100644
>>>>     >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>     >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>     >>>>> @@ -192,8 +192,9 @@ module_param_named(gartsize,
>>>>     amdgpu_gart_size,
>>>>     >>>>> uint, 0600);
>>>>     >>>>>
>>>>     >>>>>   /**
>>>>     >>>>>    * DOC: gttsize (int)
>>>>     >>>>> - * Restrict the size of GTT domain in MiB for testing. The
>>>>     >>>>> default is -1 (It's VRAM size if 3GB < VRAM < 3/4 RAM,
>>>>     >>>>> - * otherwise 3/4 RAM size).
>>>>     >>>>> + * Restrict the size of GTT domain in MiB for testing. The
>>>>     >>>>> default is -1 (On APUs this is 3/4th
>>>>     >>>>> + * of the system memory; on dGPUs this is 3GiB or VRAM 
>>>> sized,
>>>>     >>>>> whichever is bigger,
>>>>     >>>>> + * with an upper bound of 3/4th of system memory.
>>>>     >>>>>    */
>>>>     >>>>>   MODULE_PARM_DESC(gttsize, "Size of the GTT domain in
>>>>     megabytes
>>>>     >>>>> (-1 = auto)");
>>>>     >>>>>   module_param_named(gttsize, amdgpu_gtt_size, int, 0600);
>>>>     >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>     >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>     >>>>> index 4d8f19ab1014..294f26f4f310 100644
>>>>     >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>     >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>     >>>>> @@ -1865,9 +1865,15 @@ int amdgpu_ttm_init(struct 
>>>> amdgpu_device
>>>>     >>>>> *adev)
>>>>     >>>>>                  struct sysinfo si;
>>>>     >>>>>
>>>>     >>>>>                  si_meminfo(&si);
>>>>     >>>>> -               gtt_size =
>>>>     min(max((AMDGPU_DEFAULT_GTT_SIZE_MB <<
>>>>     >>>>> 20),
>>>>     >>>>> - adev->gmc.mc_vram_size),
>>>>     >>>>> - ((uint64_t)si.totalram *
>>>>     >>>>> si.mem_unit * 3/4));
>>>>     >>>>> +               gtt_size = (uint64_t)si.totalram *
>>>>     si.mem_unit * 3/4;
>>>>     >>>>> +               /* If we have dedicated memory, limit our
>>>>     GTT size to
>>>>     >>>>> +                * 3GiB or VRAM size, whichever is bigger
>>>>     >>>>> +                */
>>>>     >>>>> +               if (!(adev->flags & AMD_IS_APU)) {
>>>>     >>>>> +                       gtt_size =
>>>>     >>>>> min(max(AMDGPU_DEFAULT_GTT_SIZE_MB << 20,
>>>>     >>>>> + adev->gmc.mc_vram_size),
>>>>     >>>>> + gtt_size);
>>>>     >>>>> +               }
>>>>     >>>>>          }
>>>>     >>>>>          else
>>>>     >>>>>                  gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>>>>     >>>>> --
>>>>     >>>>> 2.30.0
>>>>     >>>>>
>>>>     >>>>> _______________________________________________
>>>>     >>>>> amd-gfx mailing list
>>>>     >>>>> amd-gfx at lists.freedesktop.org
>>>>     <mailto:amd-gfx at lists.freedesktop.org>
>>>>     >>>>>
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Cchristian.koenig%40amd.com%7C90674764db6d48dd746708d8b2456bdc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637455358520008031%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=V4FWrLLGDFGC2ZsrDNDmqGqsMT72Va%2F854f9jxPtqLU%3D&reserved=0 
>>>>
>>>> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Cchristian.koenig%40amd.com%7C90674764db6d48dd746708d8b2456bdc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637455358520018028%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=8qIBp%2Bvrsato4fv3B%2BLQmbAmCDlRE4qGPi9iN8J2Nf8%3D&reserved=0> 
>>>>
>>>>
>>>>     >>>>>
>>>>     >>
>>>>
>>>>     _______________________________________________
>>>>     amd-gfx mailing list
>>>>     amd-gfx at lists.freedesktop.org 
>>>> <mailto:amd-gfx at lists.freedesktop.org>
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Cchristian.koenig%40amd.com%7C90674764db6d48dd746708d8b2456bdc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637455358520018028%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=8qIBp%2Bvrsato4fv3B%2BLQmbAmCDlRE4qGPi9iN8J2Nf8%3D&reserved=0 
>>>>
>>>> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Cchristian.koenig%40amd.com%7C90674764db6d48dd746708d8b2456bdc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637455358520018028%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=8qIBp%2Bvrsato4fv3B%2BLQmbAmCDlRE4qGPi9iN8J2Nf8%3D&reserved=0> 
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Cchristian.koenig%40amd.com%7C90674764db6d48dd746708d8b2456bdc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637455358520018028%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=8qIBp%2Bvrsato4fv3B%2BLQmbAmCDlRE4qGPi9iN8J2Nf8%3D&reserved=0 
>>>>
>>>
> 


More information about the amd-gfx mailing list