[PATCH] drm/amdgpu: Check resize bar register when system uses large bar

Alex Deucher alexdeucher at gmail.com
Fri Jan 5 16:11:51 UTC 2024


On Fri, Jan 5, 2024 at 9:16 AM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> On 21.12.23 at 02:58, Ma, Jun wrote:
> > Hi Christian,
> >
> >
> > On 12/20/2023 10:10 PM, Christian König wrote:
> >> On 19.12.23 at 06:58, Ma Jun wrote:
> >>> Print a warning message if the system can't access
> >>> the resize BAR register when using large BAR.
> >> Well, that's a pretty clear NAK: we have embedded use cases where this
> >> would trigger an incorrect warning.
> >>
> >> What should that be good for in the first place?
> >>
> > Some customer platforms do not enable mmconfig for various reasons, such as
> > a BIOS bug, and therefore cannot access the GPU's extended configuration
> > space through MMIO.
> >
> > Therefore, when the system enters the d3cold state and resumes,
> > the amdgpu driver fails to resume because the extended configuration
> > space registers of the GPU can't be restored. At that point, we usually
> > only see some failure messages printed to dmesg by the amdgpu driver,
> > which makes it difficult to find the root cause.
> >
> > So I thought it would be helpful to print some warning messages at
> > the beginning to identify problems quickly.
>
> Interesting bug, but we can't do this here. We have a couple of devices
> where the REBAR cap isn't enabled for some reason (or not correctly
> enabled).
>
> In this case this would print a warning even when there isn't anything
> wrong.
>
> What we could potentially do is check for the MSI extension; that
> should always be there if I'm not completely mistaken.
>
> But how does this hardware platform even work without the extended mmio
> space? I mean we can't even enable/disable MSI in that configuration if
> I'm not completely mistaken.

That system is probably similar to what Mario mentioned:
https://lore.kernel.org/linux-pci/20231215220343.22523-1-mario.limonciello@amd.com/

Alex

>
> Regards,
> Christian.
>
> >
> > Regards,
> > Ma Jun
> >
> >> Regards,
> >> Christian.
> >>
> >>> Signed-off-by: Ma Jun <Jun.Ma2 at amd.com>
> >>> ---
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++++++-
> >>>    1 file changed, 9 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> index 4b694696930e..e7aedb4bd66e 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> @@ -1417,6 +1417,12 @@ void amdgpu_device_wb_free(struct amdgpu_device *adev, u32 wb)
> >>>             __clear_bit(wb, adev->wb.used);
> >>>    }
> >>>
> >>> +static inline void amdgpu_find_rb_register(struct amdgpu_device *adev)
> >>> +{
> >>> +   if (!pci_find_ext_capability(adev->pdev, PCI_EXT_CAP_ID_REBAR))
> >>> +           DRM_WARN("System can't access the resize bar register,please check!!\n");
> >>> +}
> >>> +
> >>>    /**
> >>>     * amdgpu_device_resize_fb_bar - try to resize FB BAR
> >>>     *
> >>> @@ -1444,8 +1450,10 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
> >>>
> >>>     /* skip if the bios has already enabled large BAR */
> >>>     if (adev->gmc.real_vram_size &&
> >>> -       (pci_resource_len(adev->pdev, 0) >= adev->gmc.real_vram_size))
> >>> +       (pci_resource_len(adev->pdev, 0) >= adev->gmc.real_vram_size)) {
> >>> +           amdgpu_find_rb_register(adev);
> >>>             return 0;
> >>> +   }
> >>>
> >>>     /* Check if the root BUS has 64bit memory resources */
> >>>     root = adev->pdev->bus;
>
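
For readers following the thread, here is a minimal userspace sketch of the
failure mode Ma Jun describes. It assumes the usual PCIe layout: extended
capabilities form a linked list starting at config offset 0x100, and that
region is only reachable when MMCONFIG (ECAM) access works. With legacy
access, only the first 256 bytes are visible, so a lookup for the resizable
BAR capability comes back empty. The helpers cfg_read32(), find_ext_cap() and
put_ext_cap_header() are hypothetical names made up for this illustration.
They are not kernel APIs and only mimic the general idea behind
pci_find_ext_capability(); the real kernel walk differs in detail.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_LEGACY_SIZE   256u   /* window reachable without MMCONFIG */
#define CFG_EXT_SIZE      4096u  /* full PCIe config space via MMCONFIG/ECAM */
#define EXT_CAP_START     0x100u /* first extended capability header */
#define EXT_CAP_ID_REBAR  0x15u  /* same value as the kernel's PCI_EXT_CAP_ID_REBAR */

/*
 * Read a little-endian dword from a config-space image. Reads beyond the
 * accessible window return all ones, similar to a failed config read.
 */
static uint32_t cfg_read32(const uint8_t *cfg, size_t accessible, size_t off)
{
	if (off + 4 > accessible)
		return 0xFFFFFFFFu;

	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

/*
 * Walk the extended capability list and return the offset of cap_id,
 * or 0 if it is not found or the extended space is not accessible.
 */
static unsigned int find_ext_cap(const uint8_t *cfg, size_t accessible,
				 unsigned int cap_id)
{
	size_t pos = EXT_CAP_START;
	/* Each capability is at least 8 bytes, so this bounds the walk. */
	int ttl = (CFG_EXT_SIZE - EXT_CAP_START) / 8;

	/* Without MMCONFIG, nothing at or above 0x100 can be read. */
	if (accessible <= CFG_LEGACY_SIZE)
		return 0;

	while (ttl-- > 0 && pos >= EXT_CAP_START) {
		uint32_t hdr = cfg_read32(cfg, accessible, pos);

		if (hdr == 0 || hdr == 0xFFFFFFFFu)
			break;
		if ((hdr & 0xFFFFu) == cap_id)	/* bits 15:0 = capability ID */
			return (unsigned int)pos;
		pos = (hdr >> 20) & 0xFFCu;	/* bits 31:20 = next offset */
	}
	return 0;
}

/* Test helper: write an extended capability header (version 1) at off. */
static void put_ext_cap_header(uint8_t *cfg, size_t off,
			       unsigned int cap_id, unsigned int next)
{
	uint32_t hdr = (cap_id & 0xFFFFu) | (1u << 16) |
		       ((uint32_t)(next & 0xFFCu) << 20);

	cfg[off] = hdr & 0xFF;
	cfg[off + 1] = (hdr >> 8) & 0xFF;
	cfg[off + 2] = (hdr >> 16) & 0xFF;
	cfg[off + 3] = (hdr >> 24) & 0xFF;
}
```

Calling find_ext_cap() on the same image with accessible set to 4096 and then
to 256 shows the difference: the capability is found in the first case and
reported missing in the second, even though the device itself is unchanged.
That is also why a missing REBAR capability alone cannot distinguish a broken
platform from a device that simply does not expose it, which is the objection
raised above.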

