[PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

Grodzovsky, Andrey Andrey.Grodzovsky at amd.com
Thu Feb 14 19:03:55 UTC 2019


On 2/14/19 12:54 PM, Kazlauskas, Nicholas wrote:
> On 2/14/19 12:47 PM, Grodzovsky, Andrey wrote:
>> On 2/14/19 11:57 AM, Kazlauskas, Nicholas wrote:
>>> On 2/14/19 11:47 AM, Grodzovsky, Andrey wrote:
>>>> On 2/14/19 11:07 AM, Michel Dänzer wrote:
>>>>> On 2019-02-14 4:54 p.m., Kazlauskas, Nicholas wrote:
>>>>>> On 2/14/19 10:42 AM, Grodzovsky, Andrey wrote:
>>>>>>> On 2/14/19 4:05 AM, Christian König wrote:
>>>>>>>> Am 13.02.19 um 19:58 schrieb Andrey Grodzovsky:
>>>>>>>>> When ring hang happens amdgpu_dm_commit_planes during flip is holding
>>>>>>>>> the BO reserved and then stack waiting for fences to signal in
>>>>>>>>> reservation_object_wait_timeout_rcu (which won't signal because there
>>>>>>>>> was a hnag). Then when we try to shutdown display block during reset
>>>>>>>>> recovery from drm_atomic_helper_suspend we also try to reserve the BO
>>>>>>>>> from dm_plane_helper_cleanup_fb ending in deadlock.
>>>>>>>>> Also remove useless WARN_ON
>>>>>>>> Well it is good that you pointed this out, but there are more problems
>>>>>>>> than just waiting wile the BO is reserved here.
>>>>>>>>
>>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>>>>>>>> ---
>>>>>>>>>         drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19
>>>>>>>>> +++++++++++++------
>>>>>>>>>         1 file changed, 13 insertions(+), 6 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>>>> index acc4ff8..f8dec36 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>>>> @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct
>>>>>>>>> drm_atomic_state *state,
>>>>>>>>>                      */
>>>>>>>>>                     abo = gem_to_amdgpu_bo(fb->obj[0]);
>>>>>>>>>                     r = amdgpu_bo_reserve(abo, true);
>>>>>>>> Well why do we reserve the BO in the first place? As the name
>>>>>>>> indicates reservation_object_wait_timeout_rcu() just uses rcu to wait
>>>>>>>> for the BO to be idle, no need to actually reserve it at all.
>>>>>>> That a good point, I honestly can't remember any more why it's here...
>>>>>>> It does look unnecessary given that we are not validating the BO here.
>>>>>> I think we reserve it to grab the tiling flags to update the plane
>>>>>> address (and some other bits of the plane state with a patch I'm
>>>>>> currently working on). I'm guessing that we still need the reservation
>>>>>> so nothing else tries to grab it after it becomes idle, but I'm not
>>>>>> overly familiar with that bit.
>>>>> The BO is already pinned at this point, isn't it? That's enough to
>>>>> prevent it from moving, no reservation needed.
>>>> Yes, agree.
>>>>
>>>> Andrey
>>>>
>>>>
>>> I'm not worried about the buffer moving so much as I am about userspace
>>> changing the tiling flags by calling the IOCTL. Can that happen when
>>> it's pinned?
>>>
>>> Nicholas Kazlauskas
>> In any case this has nothing to do with reserving the BO.
>>
>> Andrey
>>
>>
> Technically it does? See the function:
>
> void amdgpu_bo_get_tiling_flags(struct amdgpu_bo *bo, u64 *tiling_flags)
> {
> 	lockdep_assert_held(&bo->tbo.resv->lock.base);
>
> 	if (tiling_flags)
> 		*tiling_flags = bo->tiling_flags;
> }
>
> Nicholas Kazlauskas

What I meant that it has nothing to do with this BO reservation in the 
flip code while waiting for fences. Seems it's dafe to remove it.

Andrey




More information about the amd-gfx mailing list