[PATCH] drm/amdgpu: try allowed domain when pin framebuffer failed

Christian König ckoenig.leichtzumerken at gmail.com
Thu Dec 8 08:19:02 UTC 2022


Am 08.12.22 um 05:40 schrieb Alex Deucher:
> On Wed, Dec 7, 2022 at 11:35 PM Zhang, Yifan <Yifan1.Zhang at amd.com> wrote:
>> [AMD Official Use Only - General]
>>
>> We encountered some issues in recent APUs when tried to pin a large framebuffer (e.g. 64MB w/ dual 4K display), switch to display SG could resolve such issue.  Actually we received various kinds of VRAM shortage issues recently, there is more and more pressure on APU 512MB VRAM as FWs reserve more memory, buddy system in 5.19 creates more fragment and multiple 4k display scenario is used  more often.. Since there is no difference b/w access VRAM and System memory in APUs from HW perspective, I think we can switch some of framebuffers to system memory to mitigate VRAM pressure.
>>
>> [   52.798705] [TTM] Failed to find memory space for buffer 0x00000000833a4c59 eviction
>> [   52.798707] [TTM]  No space for 00000000833a4c59 (16470 pages, 65880K, 64M)
>> [   52.798788] amdgpu 0000:e2:00.0: amdgpu: 000000003dbf313e pin failed
>> [   52.798790] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -12
> But from the patch:
> -    if (domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) {
> +    if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> +        ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type ==
> CHIP_STONEY))) {
>           domain = AMDGPU_GEM_DOMAIN_VRAM;
>           if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
>               domain = AMDGPU_GEM_DOMAIN_GTT;
>
> AMDGPU_SG_THRESHOLD is only used in this one place which only applies
> to CZ and ST.  There are not likely to be new CZ or ST boards any time
> soon so I don't think it matters for newer APUs if we change
> AMDGPU_SG_THRESHOLD.

Yeah, agree completely. We shouldn't change the threshold because of the 
hw limitations on CZ/ST.

Maybe we should even force the BOs into GTT on those boards to avoid 
using both VRAM and GTT at the same time, but not sure about that.

Regards,
Christian.

>
> Does the patch fix the issues you are seeing?
>
> Alex
>
>> Best Regards,
>> Yifan
>>
>> -----Original Message-----
>> From: Alex Deucher <alexdeucher at gmail.com>
>> Sent: Thursday, December 8, 2022 12:19 PM
>> To: Zhang, Yifan <Yifan1.Zhang at amd.com>
>> Cc: Christian König <ckoenig.leichtzumerken at gmail.com>; Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; amd-gfx <amd-gfx-bounces at lists.freedesktop.org>; Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam at amd.com>; amd-gfx at lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
>> Subject: Re: [PATCH] drm/amdgpu: try allowed domain when pin framebuffer failed
>>
>> On Wed, Dec 7, 2022 at 11:10 PM Zhang, Yifan <Yifan1.Zhang at amd.com> wrote:
>>> [AMD Official Use Only - General]
>>>
>>> Hi Alex,
>>>
>>> We need to adjust the AMDGPU_SG_THRESHOLD as well since recent APUs are configured w/ 512MB VRAM. Pls check attached patch.
>> Why do we need to increase this threshold?  The condition only applies to CZ and ST, not more recent APUs.
>>
>> Alex
>>
>>> Best Regards,
>>> Yifan
>>>
>>> -----Original Message-----
>>> From: Alex Deucher <alexdeucher at gmail.com>
>>> Sent: Thursday, December 8, 2022 12:21 AM
>>> To: Christian König <ckoenig.leichtzumerken at gmail.com>
>>> Cc: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; Zhang, Yifan
>>> <Yifan1.Zhang at amd.com>; amd-gfx
>>> <amd-gfx-bounces at lists.freedesktop.org>; Paneer Selvam, Arunpravin
>>> <Arunpravin.PaneerSelvam at amd.com>; amd-gfx at lists.freedesktop.org;
>>> Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian
>>> <Christian.Koenig at amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: try allowed domain when pin
>>> framebuffer failed
>>>
>>> On Wed, Dec 7, 2022 at 11:10 AM Christian König <ckoenig.leichtzumerken at gmail.com> wrote:
>>>>
>>>>
>>>> Am 07.12.22 um 17:08 schrieb Alex Deucher:
>>>>> On Wed, Dec 7, 2022 at 10:52 AM Christian König
>>>>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>>>> Am 07.12.22 um 16:38 schrieb Alex Deucher:
>>>>>>> On Wed, Dec 7, 2022 at 10:23 AM Christian König
>>>>>>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>>>>>> I would go a step further and just allow GTT domain on ASICs !=
>>>>>>>> CARRIZO
>>>>>>>> | STONEY.
>>>>>>>>
>>>>>>>> I can't see a good reason we should still have any limitation
>>>>>>>> here, VRAM doesn't have any advantage any more as far as I know.
>>>>>>> Well, if VRAM is available we want to make sure someone uses it
>>>>>>> otherwise it's just wasted.
>>>>>> Well it still gets used when it's free. So now problem at all here.
>>>>>>
>>>>>> We should just not force anything into VRAM or GTT any more if
>>>>>> it's technically not necessary.
>>>>> So just this?
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> index 919bbea2e3ac..8e8f07fa7a93 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> @@ -1506,7 +1506,7 @@ u64 amdgpu_bo_gpu_offset_no_check(struct amdgpu_bo *bo)
>>>>>    uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
>>>>>                                               uint32_t domain)
>>>>>    {
>>>>> -       if (domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) {
>>>> We still need to keep this check to avoid trying to adjust VRAM only
>>>> allocations (the cursor still needs this IIRC).
>>>>
>>>> Apart from that I think that should work.
>>> Attached.  Thanks,
>>>
>>> Alex
>>>
>>>> Christian.
>>>>
>>>>> +       if ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type
>>>>> + ==
>>>>> CHIP_STONEY)) {
>>>>>                   domain = AMDGPU_GEM_DOMAIN_VRAM;
>>>>>                   if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
>>>>>                           domain = AMDGPU_GEM_DOMAIN_GTT;
>>>>>
>>>>>
>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>>
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> Am 07.12.22 um 16:10 schrieb Alex Deucher:
>>>>>>>>> Does this patch fix the problem?
>>>>>>>>>
>>>>>>>>> Alex
>>>>>>>>>
>>>>>>>>> On Wed, Dec 7, 2022 at 2:27 AM Zhang, Jesse(Jie) <Jesse.Zhang at amd.com> wrote:
>>>>>>>>>> [AMD Official Use Only - General]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>         drm/amdgpu: try allowed domain when pin framebuffer failed.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>         [WHY&HOW]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>         in some scenarios, the allocate memory often failed. such as do hot plug or play games.
>>>>>>>>>>
>>>>>>>>>>         so we can try allowed domain, if the preferred domain cannot allocate memory.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>         Signed-off-by: jie1zhan jesse.zhang at amd.com
>>>>>>>>>>
>>>>>>>>>>         Change-Id: I4b62e2ff072d02c515f901000a5789339d481273
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>>
>>>>>>>>>> index 1ae0c8723348..05fcaf7f9d92 100644
>>>>>>>>>>
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>>
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>>
>>>>>>>>>> @@ -39,6 +39,7 @@
>>>>>>>>>>
>>>>>>>>>> #include "amdgpu.h"
>>>>>>>>>>
>>>>>>>>>> #include "amdgpu_trace.h"
>>>>>>>>>>
>>>>>>>>>> #include "amdgpu_amdkfd.h"
>>>>>>>>>>
>>>>>>>>>> +#include "amdgpu_display.h"
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> /**
>>>>>>>>>>
>>>>>>>>>>       * DOC: amdgpu_object
>>>>>>>>>>
>>>>>>>>>> @@ -942,8 +943,14 @@ int amdgpu_bo_pin_restricted(struct
>>>>>>>>>> amdgpu_bo *bo, u32 domain,
>>>>>>>>>>
>>>>>>>>>>                             bo->placements[i].lpfn = lpfn;
>>>>>>>>>>
>>>>>>>>>>             }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> +       retry:
>>>>>>>>>>
>>>>>>>>>>             r = ttm_bo_validate(&bo->tbo, &bo->placement,
>>>>>>>>>> &ctx);
>>>>>>>>>>
>>>>>>>>>>             if (unlikely(r)) {
>>>>>>>>>>
>>>>>>>>>> +               //try allowed domain when pin failed. just a workaround.
>>>>>>>>>>
>>>>>>>>>> +               if (unlikely(r == -ENOMEM) && domain !=
>>>>>>>>>> + bo->allowed_domains) {
>>>>>>>>>>
>>>>>>>>>> +                       amdgpu_bo_placement_from_domain(bo,
>>>>>>>>>> + bo->allowed_domains);
>>>>>>>>>>
>>>>>>>>>> +                       goto retry;
>>>>>>>>>>
>>>>>>>>>> +               }
>>>>>>>>>>
>>>>>>>>>>                     dev_err(adev->dev, "%p pin failed\n", bo);
>>>>>>>>>>
>>>>>>>>>>                     goto error;
>>>>>>>>>>
>>>>>>>>>>             }



More information about the amd-gfx mailing list