[PATCH 10/16] drm/amdgpu: validate doorbell read/write

Shashank Sharma shashank.sharma at amd.com
Thu Mar 30 15:02:54 UTC 2023


On 30/03/2023 16:49, Luben Tuikov wrote:
> On 2023-03-30 10:37, Shashank Sharma wrote:
>> On 30/03/2023 16:34, Luben Tuikov wrote:
>>> On 2023-03-29 11:47, Shashank Sharma wrote:
>>>> This patch:
>>>> - updates start/end values for each of the doorbell object
>>>>     created.
>>>> - adds a function which validates that the kernel doorbell read/write
>>>>     is within this range.
>>>> - uses this function during doorbell writes from kernel.
>>>>
>>>> Cc: Alex Deucher <alexander.deucher at amd.com>
>>>> Cc: Christian Koenig <christian.koenig at amd.com>
>>>> Signed-off-by: Shashank Sharma <shashank.sharma at amd.com>
>>>> Signed-off-by: Arvind Yadav <arvind.yadav at amd.com>
>>>> ---
>>>>    .../gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c  | 29 ++++++++++++++++---
>>>>    1 file changed, 25 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
>>>> index b46fe8b1378d..81713b2c28e1 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
>>>> @@ -22,6 +22,25 @@
>>>>     */
>>>>    
>>>>    #include "amdgpu.h"
>>>> +#include "kfd_priv.h"
>>>> +
>>>> +static inline
>>>> +bool amdgpu_doorbell_valid(struct amdgpu_device *adev, u32 index)
>>>> +{
>>>> +	if (index >= adev->doorbell.kernel_doorbells.start &&
>>>> +	    index < adev->doorbell.kernel_doorbells.end)
>>>> +		return true;
>>>> +
>>>> +	if (index >= adev->mes.kernel_doorbells.start &&
>>>> +	    index < adev->mes.kernel_doorbells.end)
>>>> +		return true;
>>>> +
>>>> +	if (index >= adev->kfd.dev->kernel_doorbells.start &&
>>>> +	    index < adev->kfd.dev->kernel_doorbells.end)
>>>> +		return true;
>>>> +
>>>> +	return false;
>>>> +}
>>> Here you're excluding "end".
>>>
>>> In patch 7 we see this:
>>>> +	/* Last index in this object */
>>>> +	uint32_t end;
>>> Which implies that "end" is included, but in this patch, the code excludes it.
>>> Perhaps you intended to use "index <= ...end" here?
>> No, this is intended, same as array object calculation.
>>
>> end = start + size;
>>
>> max = start + size - 1
> This I understand, but "end" is NEVER "start + size" in all
> code written since 1969. "end" is outside the bounds and thus
> never used like that.
> "start" and "end" usage comes from RTL and is always inclusive,
> and "end" always fits in the same sized register as that of "start".
> But if you use "size" and add, it may overflow. So, enough history.
>
> "end" is inclusive. If this is not the case in your implementation,
> then please use "size".
>
>> so (< end) not (<= end)
>>
>> end says last index in this doorbell range;
> This I don't understand.
>
> This isn't how "start" and "end" are being used.
> Their usage comes from RTL, and is always inclusive.
>
> Either use "start" and "size" or make "end" be inclusive.
>
> I'd prefer using "start" and "size" as this is traditionally
> what is done in memory management in software (not RTL).
>
> However using "end" in software makes it tricky to calculate
> size, and one always does "end-start+1", and this could lead
> to bugs and errors.
>
> Please use "start" and "size", then.

I think I am getting your point here, it should be end inclusive in this 
case, noted.

- Shashank


>
> Regards,
> Luben
>
>
>> - Shashank
>>
>>> Since this isn't RTL, wouldn't it be better to describe the doorbell instance,
>>> with a "start" and "size"? This is traditionally used in memory management,
>>> and it makes comparisons and checks easy.
>>>
>>> Regards,
>>> Luben
>>>
>>>
>>>>    
>>>>    /**
>>>>     * amdgpu_mm_rdoorbell - read a doorbell dword
>>>> @@ -37,7 +56,7 @@ u32 amdgpu_mm_rdoorbell(struct amdgpu_device *adev, u32 index)
>>>>    	if (amdgpu_device_skip_hw_access(adev))
>>>>    		return 0;
>>>>    
>>>> -	if (index < adev->doorbell.num_kernel_doorbells) {
>>>> +	if (amdgpu_doorbell_valid(adev, index)) {
>>>>    		return readl(adev->doorbell.ptr + index);
>>>>    	} else {
>>>>    		DRM_ERROR("reading beyond doorbell aperture: 0x%08x!\n", index);
>>>> @@ -60,7 +79,7 @@ void amdgpu_mm_wdoorbell(struct amdgpu_device *adev, u32 index, u32 v)
>>>>    	if (amdgpu_device_skip_hw_access(adev))
>>>>    		return;
>>>>    
>>>> -	if (index < adev->doorbell.num_kernel_doorbells) {
>>>> +	if (amdgpu_doorbell_valid(adev, index)) {
>>>>    		writel(v, adev->doorbell.ptr + index);
>>>>    	} else {
>>>>    		DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>> @@ -81,7 +100,7 @@ u64 amdgpu_mm_rdoorbell64(struct amdgpu_device *adev, u32 index)
>>>>    	if (amdgpu_device_skip_hw_access(adev))
>>>>    		return 0;
>>>>    
>>>> -	if (index < adev->doorbell.num_kernel_doorbells) {
>>>> +	if (amdgpu_doorbell_valid(adev, index)) {
>>>>    		return atomic64_read((atomic64_t *)(adev->doorbell.ptr + index));
>>>>    	} else {
>>>>    		DRM_ERROR("reading beyond doorbell aperture: 0x%08x!\n", index);
>>>> @@ -104,7 +123,7 @@ void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v)
>>>>    	if (amdgpu_device_skip_hw_access(adev))
>>>>    		return;
>>>>    
>>>> -	if (index < adev->doorbell.num_kernel_doorbells) {
>>>> +	if (amdgpu_doorbell_valid(adev, index)) {
>>>>    		atomic64_set((atomic64_t *)(adev->doorbell.ptr + index), v);
>>>>    	} else {
>>>>    		DRM_ERROR("writing beyond doorbell aperture: 0x%08x!\n", index);
>>>> @@ -157,6 +176,8 @@ int amdgpu_doorbell_alloc_page(struct amdgpu_device *adev,
>>>>    		return r;
>>>>    	}
>>>>    
>>>> +	db_obj->start = amdgpu_doorbell_index_on_bar(adev, db_obj->bo, 0);
>>>> +	db_obj->end = db_obj->start + db_obj->size / sizeof(u32);
>>>>    	return 0;
>>>>    }
>>>>    


More information about the amd-gfx mailing list