[PATCH] drm/amdgpu: Fix UVD contiguous CS mapping problem

Paneer Selvam, Arunpravin arunpravin.paneerselvam at amd.com
Thu Nov 14 15:50:30 UTC 2024



On 11/14/2024 9:17 PM, Christian König wrote:
> Am 14.11.24 um 16:38 schrieb Paneer Selvam, Arunpravin:
>>
>> Hi Christian,
>>
>> On 11/11/2024 3:33 PM, Christian König wrote:
>>> Am 11.11.24 um 09:05 schrieb Arunpravin Paneer Selvam:
>>>> When starting the mpv player, Radeon R9 users are observing
>>>> the below error in dmesg.
>>>>
>>>> [drm:amdgpu_uvd_cs_pass2 [amdgpu]]
>>>> *ERROR* msg/fb buffer ff00f7c000-ff00f7e000 out of 256MB segment!
>>>>
>>>> The patch tries to set the TTM_PL_FLAG_CONTIGUOUS for both user
>>>> flag(AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) set and not set cases.
>>>>
>>>> Closes:https://gitlab.freedesktop.org/drm/amd/-/issues/3599
>>>> Closes:https://gitlab.freedesktop.org/drm/amd/-/issues/3501
>>>> Signed-off-by: Arunpravin Paneer Selvam 
>>>> <Arunpravin.PaneerSelvam at amd.com>
>>>> Cc: stable at vger.kernel.org
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 18 +++++++++++-------
>>>>   1 file changed, 11 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> index d891ab779ca7..9f73f821054b 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> @@ -1801,13 +1801,17 @@ int amdgpu_cs_find_mapping(struct 
>>>> amdgpu_cs_parser *parser,
>>>>       if (dma_resv_locking_ctx((*bo)->tbo.base.resv) != 
>>>> &parser->exec.ticket)
>>>>           return -EINVAL;
>>>>   -    (*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>>>> -    amdgpu_bo_placement_from_domain(*bo, (*bo)->allowed_domains);
>>>> -    for (i = 0; i < (*bo)->placement.num_placement; i++)
>>>> -        (*bo)->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
>>>> -    r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, &ctx);
>>>> -    if (r)
>>>> -        return r;
>>>> +    if ((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) {
>>>> +        (*bo)->placements[0].flags |= TTM_PL_FLAG_CONTIGUOUS;
>>>
>>> That is a pretty clearly broken approach. (*bo)->placements[0].flags 
>>> is just used temporary between the call to 
>>> amdgpu_bo_placement_from_domain() and ttm_bo_validate().
>>>
>>> So setting the TTM_PL_FLAG_CONTIGUOUS here is certainly not correct. 
>>> Why is that necessary?
>> gitlab users reported that the buffers are out of 256MB segment, 
>> looks like buffers are not contiguous, after making the
>> contiguous allocation mandatory using the TTM_PL_FLAG_CONTIGUOUS 
>> flag, they are not seeing this issue.
>
> In that case we have a bug somewhere that we don't properly initialize 
> the flags before validating something.
>
> I fear you need to investigate further since that here clearly doesn't 
> fix the bug but just hides it.
Sure, I will check the flag initialization part.

Thanks,
Arun.
>
> Regards,
> Christian.
>
>>
>> Thanks,
>> Arun.
>>>
>>> Regards,
>>> Christian.
>>>
>>>> +    } else {
>>>> +        (*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>>>> +        amdgpu_bo_placement_from_domain(*bo, (*bo)->allowed_domains);
>>>> +        for (i = 0; i < (*bo)->placement.num_placement; i++)
>>>> +            (*bo)->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
>>>> +        r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, &ctx);
>>>> +        if (r)
>>>> +            return r;
>>>> +    }
>>>>         return amdgpu_ttm_alloc_gart(&(*bo)->tbo);
>>>>   }
>>>
>>
>



More information about the amd-gfx mailing list