[PATCH] drm/amdgpu: Check resize bar register when system uses large bar
Ma, Jun
majun at amd.com
Mon Jan 8 09:24:37 UTC 2024
Hi Christian,
On 1/5/2024 9:39 PM, Christian König wrote:
> Am 21.12.23 um 02:58 schrieb Ma, Jun:
>> Hi Christian,
>>
>>
>> On 12/20/2023 10:10 PM, Christian König wrote:
>>> Am 19.12.23 um 06:58 schrieb Ma Jun:
>>>> Print a warning message if the system can't access
>>>> the resize bar register when using large bar.
>>> Well pretty clear NAK, we have embedded use cases where this would
>>> trigger an incorrect warning.
>>>
>>> What should that be good for in the first place?
>>>
>> Some customer platforms do not enable mmconfig for various reasons, such as
>> a BIOS bug, and therefore cannot access the GPU extended configuration
>> space through mmio.
>>
>> Therefore, when the system enters the d3cold state and resumes,
>> the amdgpu driver fails to resume because the extended configuration
>> space registers of the GPU can't be restored. At this point, we usually
>> only see some failure messages printed to dmesg by the amdgpu driver,
>> and it is difficult to find the root cause.
>>
>> So I thought it would be helpful to print some warning messages at
>> the beginning to identify problems quickly.
>
> Interesting bug, but we can't do this here. We have a couple of devices
> where the REBAR cap isn't enabled for some reason (or not correctly
> enabled).
>
> In this case this would print a warning even when there isn't anything
> wrong.
>
> What we could potentially do is to check for the MSI extension, that
> should always be there if I'm not completely mistaken.
>
Do you mean MSI-X? There are no extended capability registers related to
MSI or MSI-X.
How about reading the register at offset 0x100 in the extended config space,
since the extended capabilities always start at offset 0x100 according to the
PCIe spec?
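
A minimal user-space sketch of that check, operating on a config space dump
rather than on real hardware. The names cfg_read32, decode_ext_cap_header and
classify_ext_cfg are hypothetical. The bit layout follows the
PCI_EXT_CAP_ID/VER/NEXT macros in the kernel's pci_regs.h, and treating an
all-ones read as "not reachable" is an assumption about how a failed access
typically shows up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_CFG_START 0x100u /* offset of the first extended capability header */

enum ext_cfg_state {
    EXT_CFG_UNREACHABLE, /* read returned all ones (assumed access failure) */
    EXT_CFG_EMPTY,       /* header is zero: reachable, but no extended caps */
    EXT_CFG_PRESENT      /* header carries a capability ID */
};

struct ext_cap_header {
    uint16_t id;      /* bits 15:0  */
    uint8_t  version; /* bits 19:16 */
    uint16_t next;    /* bits 31:20, dword aligned */
};

/* Read a little-endian dword from a config space dump. */
static uint32_t cfg_read32(const uint8_t *cfg, size_t len, size_t off)
{
    assert(cfg != NULL && off + 4 <= len);
    return (uint32_t)cfg[off] |
           ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) |
           ((uint32_t)cfg[off + 3] << 24);
}

/* Split a raw extended capability header into its fields. */
static struct ext_cap_header decode_ext_cap_header(uint32_t raw)
{
    struct ext_cap_header h;

    h.id = (uint16_t)(raw & 0xffffu);
    h.version = (uint8_t)((raw >> 16) & 0xfu);
    h.next = (uint16_t)((raw >> 20) & 0xffcu);
    return h;
}

/* Classify the dword read at EXT_CFG_START. */
static enum ext_cfg_state classify_ext_cfg(uint32_t raw)
{
    if (raw == 0xffffffffu)
        return EXT_CFG_UNREACHABLE;
    if (raw == 0)
        return EXT_CFG_EMPTY;
    return EXT_CFG_PRESENT;
}
```

Unlike looking up one specific capability, this only asks whether offset 0x100
is reachable at all, so a device without a REBAR capability would not be
flagged.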
> But how does this hardware platform even work without the extended mmio
> space? I mean we can't even enable/disable MSI in that configuration if
> I'm not completely mistaken.
I think the MSI-related configuration registers are in the legacy
configuration space, so the system doesn't need to use mmio to access these
registers.
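
To illustrate the point, here is a small sketch that walks the legacy
capability list inside the first 256 bytes of a config space dump and looks
for the MSI (0x05) or MSI-X (0x11) capability. The function name
find_legacy_cap and the CFG_*/CAP_* macro names are hypothetical. The offsets
used (status register at 0x06, capability pointer at 0x34) are the standard
ones, and the iteration limit of 48 is an arbitrary guard against malformed
lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_STATUS          0x06u /* status register, low byte */
#define CFG_STATUS_CAP_LIST 0x10u /* capability list present */
#define CFG_CAP_PTR         0x34u /* pointer to the first capability */
#define CAP_ID_MSI          0x05u
#define CAP_ID_MSIX         0x11u

/*
 * Return the offset of capability cap_id in the legacy config space,
 * or 0 if it is not found. cfg must point to at least 256 bytes.
 */
static uint8_t find_legacy_cap(const uint8_t *cfg, uint8_t cap_id)
{
    uint8_t pos;
    int ttl = 48; /* guard against a looping capability list */

    assert(cfg != NULL);

    if (!(cfg[CFG_STATUS] & CFG_STATUS_CAP_LIST))
        return 0;

    /* The low two bits of every capability pointer are reserved. */
    pos = cfg[CFG_CAP_PTR] & 0xfc;
    while (pos >= 0x40 && ttl-- > 0) {
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xfc;
    }
    return 0;
}
```

Every offset touched here is below 0x100, so the walk needs nothing beyond
legacy config access.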
Regards,
Ma Jun
>
> Regards,
> Christian.
>
>>
>> Regards,
>> Ma Jun
>>
>>> Regards,
>>> Christian.
>>>
>>>> Signed-off-by: Ma Jun <Jun.Ma2 at amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++++++-
>>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 4b694696930e..e7aedb4bd66e 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -1417,6 +1417,12 @@ void amdgpu_device_wb_free(struct amdgpu_device *adev, u32 wb)
>>>> __clear_bit(wb, adev->wb.used);
>>>> }
>>>>
>>>> +static inline void amdgpu_find_rb_register(struct amdgpu_device *adev)
>>>> +{
>>>> + if (!pci_find_ext_capability(adev->pdev, PCI_EXT_CAP_ID_REBAR))
>>>> + DRM_WARN("System can't access the resize bar register,please check!!\n");
>>>> +}
>>>> +
>>>> /**
>>>> * amdgpu_device_resize_fb_bar - try to resize FB BAR
>>>> *
>>>> @@ -1444,8 +1450,10 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>>>>
>>>> /* skip if the bios has already enabled large BAR */
>>>> if (adev->gmc.real_vram_size &&
>>>> - (pci_resource_len(adev->pdev, 0) >= adev->gmc.real_vram_size))
>>>> + (pci_resource_len(adev->pdev, 0) >= adev->gmc.real_vram_size)) {
>>>> + amdgpu_find_rb_register(adev);
>>>> return 0;
>>>> + }
>>>>
>>>> /* Check if the root BUS has 64bit memory resources */
>>>> root = adev->pdev->bus;
>