[PATCH 2/2] drm/amdgpu: reset gpu for pm abort case

Lazar, Lijo lijo.lazar at amd.com
Thu Jan 25 13:11:18 UTC 2024



On 1/25/2024 8:52 AM, Prike Liang wrote:
> In the PM abort case the gfx power rail is not turned off from the FCH
> side, which leads to gfx reinitialization failing because of the unknown
> gfx HW status, so let's reset the GPU to a known good power state.
> 

From the description, this is an APU-only problem (or this patch can only
resolve the APU abort sequence). However, there is no check for APU in the
patch below.
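
A minimal sketch of the kind of guard meant here, written as a standalone C mock rather than driver code. The real driver marks APUs with the AMD_IS_APU chip flag; the struct, the MOCK_IS_APU macro, and the helper name below are hypothetical stand-ins used only for illustration, and the condition is copied from the soc15.c hunk further down.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's AMD_IS_APU chip flag. */
#define MOCK_IS_APU (1UL << 0)

/* Hypothetical, reduced stand-in for struct amdgpu_device. */
struct mock_adev {
	unsigned long flags;   /* chip flags, e.g. MOCK_IS_APU */
	bool in_suspend;
	bool in_s0ix;
	bool pm_complete;
	unsigned int sol_reg;  /* sOS sign-of-life register value */
};

/* Abort-path reset decision from the patch, additionally gated on APU. */
static bool mock_need_reset_on_pm_abort(const struct mock_adev *adev)
{
	/* Assumption: only APUs are affected, so dGPUs never take this path. */
	if (!(adev->flags & MOCK_IS_APU))
		return false;

	return adev->in_suspend && !adev->in_s0ix &&
	       !adev->pm_complete && adev->sol_reg != 0;
}
```

With such a guard, a dGPU in the same aborted-suspend state would not be reset by this path.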


> Signed-off-by: Prike Liang <Prike.Liang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++++
>  drivers/gpu/drm/amd/amdgpu/soc15.c         | 8 +++++++-
>  2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 56d9dfa61290..4c40ffaaa5c2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4627,6 +4627,11 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
>  			return r;
>  	}
>  
> +	if(amdgpu_asic_need_reset_on_init(adev)) {
> +		DRM_INFO("PM abort case and let's reset asic \n");
> +		amdgpu_asic_reset(adev);
> +	}
> +

suspend_noirq is specific to suspend scenarios and is not valid for
freeze/thaw. I guess this could trigger a reset on a successful restore
on APUs.
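
To illustrate the concern, here is a standalone C mock (not driver code). It assumes that pm_complete is set only from the suspend_noirq callback and that in_suspend is also set on the freeze path; neither assumption is confirmed by this patch. All struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of the PM flags checked by the patch. */
struct mock_pm_state {
	bool in_suspend;
	bool in_s0ix;
	bool pm_complete;
	unsigned int sol_reg;  /* non-zero: sOS is still alive */
};

/* Suspend ran to completion: suspend_noirq was reached (assumed). */
static struct mock_pm_state mock_after_full_suspend(void)
{
	struct mock_pm_state s = { true, false, true, 1 };
	return s;
}

/* Suspend was aborted before suspend_noirq ran. */
static struct mock_pm_state mock_after_aborted_suspend(void)
{
	struct mock_pm_state s = { true, false, false, 1 };
	return s;
}

/* Freeze/thaw: suspend_noirq is never called, so pm_complete stays false. */
static struct mock_pm_state mock_after_freeze(void)
{
	struct mock_pm_state s = { true, false, false, 1 };
	return s;
}

/* Same condition as the new check in soc15_need_reset_on_init(). */
static bool mock_resume_would_reset(const struct mock_pm_state *s)
{
	return s->in_suspend && !s->in_s0ix &&
	       !s->pm_complete && s->sol_reg != 0;
}
```

In this model the freeze/thaw path is indistinguishable from an aborted suspend, which is the unwanted reset described above.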

>  	if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>  		return 0;
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 15033efec2ba..9329a00b6abc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -804,9 +804,16 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
>  	if (adev->asic_type == CHIP_RENOIR)
>  		return true;
>  
> +	sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> +
>  	/* Just return false for soc15 GPUs.  Reset does not seem to
>  	 * be necessary.
>  	 */

This comment no longer makes sense now that a check returning true
follows it.

Thanks,
Lijo

> +	if (adev->in_suspend && !adev->in_s0ix &&
> +			!adev->pm_complete &&
> +			sol_reg)
> +		return true;
> +
>  	if (!amdgpu_passthrough(adev))
>  		return false;
>  
> @@ -816,7 +823,6 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
>  	/* Check sOS sign of life register to confirm sys driver and sOS
>  	 * are already been loaded.
>  	 */
> -	sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
>  	if (sol_reg)
>  		return true;
>  

