[PATCH] drm/amdgpu: fix the Carrizo UVD hang on system reboot

Christian König ckoenig.leichtzumerken at gmail.com
Mon Nov 8 10:26:18 UTC 2021


Am 05.11.21 um 08:58 schrieb Quan, Evan:
> [AMD Official Use Only]
>
>> -----Original Message-----
>> From: Lazar, Lijo <Lijo.Lazar at amd.com>
>> Sent: Thursday, November 4, 2021 4:55 PM
>> To: Quan, Evan <Evan.Quan at amd.com>; amd-gfx at lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
>> Subject: Re: [PATCH] drm/amdgpu: fix the Carrizo UVD hang on system
>> reboot
>>
>>
>>
>> On 11/4/2021 1:49 PM, Evan Quan wrote:
>>> It's confirmed that on some APUs the interaction with SMU about DPM
>>> disablement will power off the UVD completely. Thus the succeeding
>>> interactions with UVD during the reboot will trigger hard hang. To
>>> workaround this issue, we will skip the dpm disablement on APUs.
>>>
>>> Signed-off-by: Evan Quan <evan.quan at amd.com>
>>> Change-Id: I4340cc2fb0fd94f439cbac5d4963fe920866bc13
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 20 ++++++++++--------
>>>    drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 30
>> +++++++++++++++++++--------
>>>    drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 18 +++++++++-------
>>>    drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 18 +++++++++-------
>>>    4 files changed, 52 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
>>> b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
>>> index c108b8381795..67ec13622e51 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
>>> @@ -238,15 +238,17 @@ static int uvd_v4_2_suspend(void *handle)
>>>    	 */
>>>    	cancel_delayed_work_sync(&adev->uvd.idle_work);
>> If the idle work handler had already started execution, it also goes through
>> the same logic. Wouldn't that be a problem? The other case is if decode work
>> is already completed before suspend is called, then also it disables DPM. Not
>> sure how it works then, or is this caused by a second atempt to power off
>> again after idle work is executed?
> [Quan, Evan] Good point. Yes, maybe there should not be 2nd attempt when the target IP is already in the desired state.
> Let me confirm with Boris.

It sounds like you not only need to prevent the clock gating, but also 
enable the clocks during shutdown.

Regards,
Christian.

>
> BR
> Evan
>> Thanks,
>> Lijo
>>
>>> -	if (adev->pm.dpm_enabled) {
>>> -		amdgpu_dpm_enable_uvd(adev, false);
>>> -	} else {
>>> -		amdgpu_asic_set_uvd_clocks(adev, 0, 0);
>>> -		/* shutdown the UVD block */
>>> -		amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> -						       AMD_PG_STATE_GATE);
>>> -		amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> -						       AMD_CG_STATE_GATE);
>>> +	if (!(adev->flags & AMD_IS_APU)) {
>>> +		if (adev->pm.dpm_enabled) {
>>> +			amdgpu_dpm_enable_uvd(adev, false);
>>> +		} else {
>>> +			amdgpu_asic_set_uvd_clocks(adev, 0, 0);
>>> +			/* shutdown the UVD block */
>>> +			amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> +
>> AMD_PG_STATE_GATE);
>>> +			amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> +
>> AMD_CG_STATE_GATE);
>>> +		}
>>>    	}
>>>
>>>    	r = uvd_v4_2_hw_fini(adev);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>>> index 2d558c2f417d..60d05ec8c953 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>>> @@ -560,15 +560,27 @@ static int uvd_v6_0_suspend(void *handle)
>>>    	 */
>>>    	cancel_delayed_work_sync(&adev->uvd.idle_work);
>>>
>>> -	if (adev->pm.dpm_enabled) {
>>> -		amdgpu_dpm_enable_uvd(adev, false);
>>> -	} else {
>>> -		amdgpu_asic_set_uvd_clocks(adev, 0, 0);
>>> -		/* shutdown the UVD block */
>>> -		amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> -						       AMD_PG_STATE_GATE);
>>> -		amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> -						       AMD_CG_STATE_GATE);
>>> +	/*
>>> +	 * It's confirmed that on some APUs the interaction with SMU(about
>> DPM disablement)
>>> +	 * will power off the UVD. That will make the succeeding interactions
>> with UVD on the
>>> +	 * suspend path impossible. And the system will hang due to that. To
>> workaround the
>>> +	 * issue, we will skip the dpm disablement on APUs.
>>> +	 *
>>> +	 * TODO: a better solution is to reorg the action chains performed on
>> suspend and make
>>> +	 * the dpm disablement the last one. But that will involve a lot and
>> needs MM team's
>>> +	 * help.
>>> +	 */
>>> +	if (!(adev->flags & AMD_IS_APU)) {
>>> +		if (adev->pm.dpm_enabled) {
>>> +			amdgpu_dpm_enable_uvd(adev, false);
>>> +		} else {
>>> +			amdgpu_asic_set_uvd_clocks(adev, 0, 0);
>>> +			/* shutdown the UVD block */
>>> +			amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> +
>> AMD_PG_STATE_GATE);
>>> +			amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_UVD,
>>> +
>> AMD_CG_STATE_GATE);
>>> +		}
>>>    	}
>>>
>>>    	r = uvd_v6_0_hw_fini(adev);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
>>> index 67eb01fef789..8aa9d8c07053 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v2_0.c
>>> @@ -505,14 +505,16 @@ static int vce_v2_0_suspend(void *handle)
>>>    	 */
>>>    	cancel_delayed_work_sync(&adev->vce.idle_work);
>>>
>>> -	if (adev->pm.dpm_enabled) {
>>> -		amdgpu_dpm_enable_vce(adev, false);
>>> -	} else {
>>> -		amdgpu_asic_set_vce_clocks(adev, 0, 0);
>>> -		amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> -						       AMD_PG_STATE_GATE);
>>> -		amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> -						       AMD_CG_STATE_GATE);
>>> +	if (!(adev->flags & AMD_IS_APU)) {
>>> +		if (adev->pm.dpm_enabled) {
>>> +			amdgpu_dpm_enable_vce(adev, false);
>>> +		} else {
>>> +			amdgpu_asic_set_vce_clocks(adev, 0, 0);
>>> +			amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> +
>> AMD_PG_STATE_GATE);
>>> +			amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> +
>> AMD_CG_STATE_GATE);
>>> +		}
>>>    	}
>>>
>>>    	r = vce_v2_0_hw_fini(adev);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>>> index 142e291983b4..b177cd442838 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>>> @@ -520,14 +520,16 @@ static int vce_v3_0_suspend(void *handle)
>>>    	 */
>>>    	cancel_delayed_work_sync(&adev->vce.idle_work);
>>>
>>> -	if (adev->pm.dpm_enabled) {
>>> -		amdgpu_dpm_enable_vce(adev, false);
>>> -	} else {
>>> -		amdgpu_asic_set_vce_clocks(adev, 0, 0);
>>> -		amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> -						       AMD_PG_STATE_GATE);
>>> -		amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> -						       AMD_CG_STATE_GATE);
>>> +	if (!(adev->flags & AMD_IS_APU)) {
>>> +		if (adev->pm.dpm_enabled) {
>>> +			amdgpu_dpm_enable_vce(adev, false);
>>> +		} else {
>>> +			amdgpu_asic_set_vce_clocks(adev, 0, 0);
>>> +			amdgpu_device_ip_set_powergating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> +
>> AMD_PG_STATE_GATE);
>>> +			amdgpu_device_ip_set_clockgating_state(adev,
>> AMD_IP_BLOCK_TYPE_VCE,
>>> +
>> AMD_CG_STATE_GATE);
>>> +		}
>>>    	}
>>>
>>>    	r = vce_v3_0_hw_fini(adev);
>>>



More information about the amd-gfx mailing list