[PATCH] drm/amdgpu: Check resize bar register when system uses large bar

Ma, Jun majun at amd.com
Tue Jan 9 09:44:11 UTC 2024



On 1/9/2024 12:24 AM, Christian König wrote:
> Am 08.01.24 um 10:24 schrieb Ma, Jun:
>> Hi Christian,
>>
>> On 1/5/2024 9:39 PM, Christian König wrote:
>>> Am 21.12.23 um 02:58 schrieb Ma, Jun:
>>>> Hi Christian,
>>>>
>>>>
>>>> On 12/20/2023 10:10 PM, Christian König wrote:
>>>>> Am 19.12.23 um 06:58 schrieb Ma Jun:
>>>>>> Print a warning message if the system can't access
>>>>>> the resize bar register when using large bar.
>>>>> Well pretty clear NAK, we have embedded use cases where this would
>>>>> trigger an incorrect warning.
>>>>>
>>>>> What should that be good for in the first place?
>>>>>
>>>> Some customer platforms do not enable MMCONFIG for various reasons, such as
>>>> BIOS bugs, and therefore cannot access the GPU's extended configuration
>>>> space through MMIO.
>>>>
>>>> Therefore, when the system enters the D3cold state and resumes,
>>>> the amdgpu driver fails to resume because the GPU's extended configuration
>>>> space registers can't be restored. At that point we usually
>>>> only see some failure messages printed to dmesg by the amdgpu driver, and it is
>>>> difficult to find the root cause.
>>>>
>>>> So I thought it would be helpful to print some warning messages at
>>>> the beginning to identify problems quickly.
>>> Interesting bug, but we can't do this here. We have a couple of devices 
>>> where the REBAR cap isn't enabled for some reason (or not correctly 
>>> enabled).
>>>
>>> In this case this would print a warning even when there isn't anything 
>>> wrong.
>>>
>>> What we could potentially do is to check for the MSI extension, that 
>>> should always be there if I'm not completely mistaken.
>>>
>> Do you mean MSI-X? There are no extended capability registers related to
>> MSI or MSI-X.
>>
>> How about reading the register at offset 0x100 in the extended config space, since the
>> extended capabilities always start at offset 0x100 according to the PCIe
>> spec?
> 
> Yeah, that should work as well. At least some extension should be there in the extended config space.
> 
Ok, I'll submit a new patch.
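
As a rough illustration of the check discussed above, here is a minimal
standalone sketch in plain C (not driver code and not the actual follow-up
patch). The function names are hypothetical. It assumes that a failed
config read returns all ones and that an all-zero header at 0x100 means no
extended capability is present. In the driver, the header value would come
from something like pci_read_config_dword() at offset 0x100.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Field layout of a PCIe extended capability header (32 bits):
 *   bits 15:0  capability ID
 *   bits 19:16 capability version
 *   bits 31:20 offset of the next capability (dword aligned)
 */
static inline uint16_t ext_cap_id(uint32_t header)
{
	return header & 0xffffu;
}

static inline uint8_t ext_cap_version(uint32_t header)
{
	return (header >> 16) & 0xfu;
}

static inline uint16_t ext_cap_next(uint32_t header)
{
	return (header >> 20) & 0xffcu;
}

/*
 * Hypothetical helper: decide whether the dword read at offset 0x100
 * looks like a real extended capability header.
 */
static bool ext_cfg_space_looks_accessible(uint32_t header_at_0x100)
{
	/* Assumption: an inaccessible config read returns all ones. */
	if (header_at_0x100 == 0xffffffffu)
		return false;

	/* ID, version and next pointer all zero: no extended capability. */
	if (header_at_0x100 == 0)
		return false;

	return true;
}
```

A caller would print the warning only when the helper returns false. Unlike
the REBAR lookup, this does not depend on one specific capability being
present, only on at least one extended capability existing at 0x100.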

Regards,
Ma Jun
>>> But how does this hardware platform even work without the extended mmio
>>> space? I mean we can't even enable/disable MSI in that configuration if 
>>> I'm not completely mistaken.
>> I think the MSI-related configuration registers are in the legacy
>> configuration space, so the system doesn't need to use MMIO to access these
>> registers.
> 
> Ah, yes that could explain it.
> 
> Thanks,
> Christian.
> 
>> Regards,
>> Ma Jun
>>
>>> Regards,
>>> Christian.
>>>
>>>> Regards,
>>>> Ma Jun
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Ma Jun <Jun.Ma2 at amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++++++-
>>>>>>    1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index 4b694696930e..e7aedb4bd66e 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -1417,6 +1417,12 @@ void amdgpu_device_wb_free(struct amdgpu_device *adev, u32 wb)
>>>>>>    		__clear_bit(wb, adev->wb.used);
>>>>>>    }
>>>>>>    
>>>>>> +static inline void amdgpu_find_rb_register(struct amdgpu_device *adev)
>>>>>> +{
>>>>>> +	if (!pci_find_ext_capability(adev->pdev, PCI_EXT_CAP_ID_REBAR))
>>>>>> +		DRM_WARN("System can't access the resize bar register,please check!!\n");
>>>>>> +}
>>>>>> +
>>>>>>    /**
>>>>>>     * amdgpu_device_resize_fb_bar - try to resize FB BAR
>>>>>>     *
>>>>>> @@ -1444,8 +1450,10 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>>>>>>    
>>>>>>    	/* skip if the bios has already enabled large BAR */
>>>>>>    	if (adev->gmc.real_vram_size &&
>>>>>> -	    (pci_resource_len(adev->pdev, 0) >= adev->gmc.real_vram_size))
>>>>>> +	    (pci_resource_len(adev->pdev, 0) >= adev->gmc.real_vram_size)) {
>>>>>> +		amdgpu_find_rb_register(adev);
>>>>>>    		return 0;
>>>>>> +	}
>>>>>>    
>>>>>>    	/* Check if the root BUS has 64bit memory resources */
>>>>>>    	root = adev->pdev->bus;
> 

