[PATCH] drm/buddy: Fix alloc_range() error handling code
Matthew Auld
matthew.auld at intel.com
Thu Feb 8 14:44:12 UTC 2024
On 08/02/2024 14:17, Matthew Auld wrote:
> On 08/02/2024 13:47, Arunpravin Paneer Selvam wrote:
>> Hi Matthew,
>>
>> On 2/8/2024 7:00 PM, Matthew Auld wrote:
>>> On 07/02/2024 17:44, Arunpravin Paneer Selvam wrote:
>>>> Few users have observed display corruption when they boot
>>>> the machine to KDE Plasma or playing games. We have root
>>>> caused the problem that whenever alloc_range() couldn't
>>>> find the required memory blocks the function was returning
>>>> SUCCESS in some of the corner cases.
>>>
>>> Can you please give an example here?
>>>
>> In the try hard contiguous allocation, for example the requested
>> memory is 1024 pages,
>> it might go and pick the highest and last block (of size 512 pages) in
>> the freelist where
>> there are no more space exist in the total address range. In this kind
>> of corner case,
>> alloc_range was returning success though the allocated size is less
>> than the requested size.
>> Hence in try_hard_contiguous_allocation, we will not proceed to the
>> LHS allocation and
>> we return only with the RHS allocation having only the 512 pages of
>> allocation. This
>> leads to display corruption in many use cases (I think mainly when
>> requested for contiguous huge buffer)
>> mainly on APU platforms.
>
> Ok, I guess other thing is doing:
>
> lhs_offset = drm_buddy_block_offset(block) - lhs_size;
>
> I presume it's possible for block_offset < lhs_size here, which might be
> funny?
I think would also be good to add some basic unit test here:
https://patchwork.freedesktop.org/patch/577497/?series=129671&rev=1
Test passes with your patch, and ofc fails without it.
Just the question of the lhs_offset above,
Reviewed-by: Matthew Auld <matthew.auld at intel.com>
>
>>
>> Thanks,
>> Arun.
>>>>
>>>> The right approach would be if the total allocated size
>>>> is less than the required size, the function should
>>>> return -ENOSPC.
>>>>
>>>> Gitlab ticket link -
>>>> https://gitlab.freedesktop.org/drm/amd/-/issues/3097
>>>> Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
>>>> Signed-off-by: Arunpravin Paneer Selvam
>>>> <Arunpravin.PaneerSelvam at amd.com>
>>>> Tested-by: Mario Limonciello <mario.limonciello at amd.com>
>>>> ---
>>>> drivers/gpu/drm/drm_buddy.c | 6 ++++++
>>>> 1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>>>> index f57e6d74fb0e..c1a99bf4dffd 100644
>>>> --- a/drivers/gpu/drm/drm_buddy.c
>>>> +++ b/drivers/gpu/drm/drm_buddy.c
>>>> @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
>>>> } while (1);
>>>> list_splice_tail(&allocated, blocks);
>>>> +
>>>> + if (total_allocated < size) {
>>>> + err = -ENOSPC;
>>>> + goto err_free;
>>>> + }
>>>> +
>>>> return 0;
>>>> err_undo:
>>
More information about the Intel-gfx
mailing list