[PATCH] drm/amdgpu: fix Polaris12 uvd crash on driver unload

Lazar, Lijo lijo.lazar at amd.com
Mon Oct 18 09:15:28 UTC 2021



On 10/18/2021 2:38 PM, Quan, Evan wrote:
> [AMD Official Use Only]
> 
> 
> 
>> -----Original Message-----
>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>> Sent: Monday, October 18, 2021 4:05 PM
>> To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Grodzovsky,
>> Andrey <Andrey.Grodzovsky at amd.com>
>> Subject: Re: [PATCH] drm/amdgpu: fix Polaris12 uvd crash on driver unload
>>
>>
>>
>> On 10/18/2021 1:06 PM, Quan, Evan wrote:
>>> [AMD Official Use Only]
>>>
>>> Ping..
>>>
>>>> -----Original Message-----
>>>> From: Quan, Evan <Evan.Quan at amd.com>
>>>> Sent: Monday, October 11, 2021 4:28 PM
>>>> To: amd-gfx at lists.freedesktop.org
>>>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>; Grodzovsky,
>>>> Andrey <Andrey.Grodzovsky at amd.com>; Quan, Evan <Evan.Quan at amd.com>
>>>> Subject: [PATCH] drm/amdgpu: fix Polaris12 uvd crash on driver unload
>>>>
>>>> This is a supplement for the change below:
>>>> cdccf1ffe1a3 drm/amdgpu: Fix crash on device remove/driver unload
>>>>
>>>> Signed-off-by: Evan Quan <evan.quan at amd.com>
>>>> Change-Id: Iedc25e2f572f04772511d56781b01b481e22fd00
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 +++++++++++++-----------
>>>>    1 file changed, 13 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>>>> index d5d023a24269..2d558c2f417d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>>>> @@ -534,6 +534,19 @@ static int uvd_v6_0_hw_fini(void *handle)
>>>>    {
>>>>    	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>>
>>>> +	cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>> +
>>
>> To solve Boris' issue, this patch should be modified so that the DPM
>> disable done by the idle job is not executed during hw_fini.
>> Preventing powergating during suspend is not needed.
> [Quan, Evan] This is not intended to fix Boris' issue. It just adds what was missing from Andrey's previous fix:
> cdccf1ffe1a3 drm/amdgpu: Fix crash on device remove/driver unload
> 

What I meant is that this cancel of the delayed work job has the
potential (depending on timing) to create the issue that Boris faced
during reboot. Additionally, this and the steps below will fix Boris'
issue (at least I believe so) once the DPM disable is skipped in the
idle job during hw_fini.
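
A minimal user-space sketch of the idea above, not driver code. The
struct, the in_hw_fini flag and the uvd_model_* function names are all
hypothetical and exist only for illustration; the real handler and
teardown path live in amdgpu and may look quite different. The one
kernel fact relied on is that cancel_delayed_work_sync() waits for an
already executing work item to finish, so the handler can still run
while hw_fini is in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the UVD state; not the real amdgpu structs. */
struct uvd_model {
	bool dpm_enabled;       /* models UVD DPM being enabled */
	bool in_hw_fini;        /* hypothetical flag: teardown in progress */
	bool idle_work_running; /* handler already started when hw_fini runs */
};

/* Models the idle work handler, which normally disables DPM when idle. */
static void uvd_model_idle_work(struct uvd_model *uvd)
{
	uvd->idle_work_running = false;

	/* Assumed change: skip the DPM disable while hw_fini is running. */
	if (uvd->in_hw_fini)
		return;

	uvd->dpm_enabled = false;
}

/* Models hw_fini: mark teardown first, then let a running handler finish. */
static void uvd_model_hw_fini(struct uvd_model *uvd)
{
	uvd->in_hw_fini = true;

	/*
	 * Stand-in for cancel_delayed_work_sync(): a handler that is
	 * already executing runs to completion before hw_fini continues.
	 */
	if (uvd->idle_work_running)
		uvd_model_idle_work(uvd);
}
```

With the flag set before the work is flushed, a handler that races with
hw_fini leaves DPM untouched; outside hw_fini it behaves as before.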

Thanks,
Lijo

> BR
> Evan
>>
>> Thanks,
>> Lijo
>>
>>>> +	if (RREG32(mmUVD_STATUS) != 0)
>>>> +		uvd_v6_0_stop(adev);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int uvd_v6_0_suspend(void *handle)
>>>> +{
>>>> +	int r;
>>>> +	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> +
>>>>    	/*
>>>>    	 * Proper cleanups before halting the HW engine:
>>>>    	 *   - cancel the delayed idle work
>>>> @@ -558,17 +571,6 @@ static int uvd_v6_0_hw_fini(void *handle)
>>>>    						       AMD_CG_STATE_GATE);
>>>>    	}
>>>>
>>>> -	if (RREG32(mmUVD_STATUS) != 0)
>>>> -		uvd_v6_0_stop(adev);
>>>> -
>>>> -	return 0;
>>>> -}
>>>> -
>>>> -static int uvd_v6_0_suspend(void *handle)
>>>> -{
>>>> -	int r;
>>>> -	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>>> -
>>>>    	r = uvd_v6_0_hw_fini(adev);
>>>>    	if (r)
>>>>    		return r;
>>>> --
>>>> 2.29.0

