[PATCH] drm/amd/display: Fix deadlock with display during hanged ring recovery.

Michel Dänzer michel at daenzer.net
Thu Feb 14 17:31:18 UTC 2019


On 2019-02-14 5:57 p.m., Kazlauskas, Nicholas wrote:
> On 2/14/19 11:47 AM, Grodzovsky, Andrey wrote:
>>
>> On 2/14/19 11:07 AM, Michel Dänzer wrote:
>>> On 2019-02-14 4:54 p.m., Kazlauskas, Nicholas wrote:
>>>> On 2/14/19 10:42 AM, Grodzovsky, Andrey wrote:
>>>>> On 2/14/19 4:05 AM, Christian König wrote:
>>>>>> Am 13.02.19 um 19:58 schrieb Andrey Grodzovsky:
>>>>>>> When ring hang happens amdgpu_dm_commit_planes during flip is holding
>>>>>>> the BO reserved and then stack waiting for fences to signal in
>>>>>>> reservation_object_wait_timeout_rcu (which won't signal because there
>>>>>>> was a hnag). Then when we try to shutdown display block during reset
>>>>>>> recovery from drm_atomic_helper_suspend we also try to reserve the BO
>>>>>>> from dm_plane_helper_cleanup_fb ending in deadlock.
>>>>>>> Also remove useless WARN_ON
>>>>>> Well it is good that you pointed this out, but there are more problems
>>>>>> than just waiting wile the BO is reserved here.
>>>>>>
>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19
>>>>>>> +++++++++++++------
>>>>>>>      1 file changed, 13 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>> index acc4ff8..f8dec36 100644
>>>>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>>>>>> @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct
>>>>>>> drm_atomic_state *state,
>>>>>>>                   */
>>>>>>>                  abo = gem_to_amdgpu_bo(fb->obj[0]);
>>>>>>>                  r = amdgpu_bo_reserve(abo, true);
>>>>>> Well why do we reserve the BO in the first place? As the name
>>>>>> indicates reservation_object_wait_timeout_rcu() just uses rcu to wait
>>>>>> for the BO to be idle, no need to actually reserve it at all.
>>>>> That a good point, I honestly can't remember any more why it's here...
>>>>> It does look unnecessary given that we are not validating the BO here.
>>>> I think we reserve it to grab the tiling flags to update the plane
>>>> address (and some other bits of the plane state with a patch I'm
>>>> currently working on). I'm guessing that we still need the reservation
>>>> so nothing else tries to grab it after it becomes idle, but I'm not
>>>> overly familiar with that bit.
>>> The BO is already pinned at this point, isn't it? That's enough to
>>> prevent it from moving, no reservation needed.
>>
>>
>> Yes, agree.
>>
>> Andrey
>>
>>
>>>
>>>
> 
> I'm not worried about the buffer moving so much as I am about userspace 
> changing the tiling flags by calling the IOCTL. Can that happen when 
> it's pinned?

Even assuming it can theoretically, that would be a "doctor, it hurts if
I do this" kind of userspace problem. :) Userspace has to set the
correct tiling flags before submitting the flip, and then can't change
them until the BO stops being scanned out.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer


More information about the amd-gfx mailing list