[PATCH] drm/amdgpu: Disable ras features on all IPs before gpu reset

Grodzovsky, Andrey Andrey.Grodzovsky at amd.com
Thu Jul 4 13:36:47 UTC 2019


Acked-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>

Andrey

On 7/3/19 11:09 PM, Pan, Xinhui wrote:
> Perform a ras_suspend to disable RAS on all IPs, to work around
> a ROCm stability issue.
>
> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5132c59b4397..99208fe684aa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3759,6 +3759,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>   
>   	/* block all schedulers and reset given job's ring */
>   	list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
> +		/* disable ras on ALL IPs */
> +		if (amdgpu_device_ip_need_full_reset(tmp_adev))
> +			amdgpu_ras_suspend(tmp_adev);
> +
>   		for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>   			struct amdgpu_ring *ring = tmp_adev->rings[i];
>   
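
For anyone reading along outside the kernel tree, the control flow the hunk adds can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct mock_device, mock_ras_suspend() and suspend_ras_before_reset() are hypothetical stand-ins invented here, not amdgpu symbols, and the real amdgpu_ras_suspend() does far more than clear a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct amdgpu_device; not the kernel type. */
struct mock_device {
	bool need_full_reset;	/* models amdgpu_device_ip_need_full_reset() */
	bool ras_enabled;	/* cleared by the mock suspend below */
};

/* Mock of amdgpu_ras_suspend(): here it only clears a flag. */
static void mock_ras_suspend(struct mock_device *dev)
{
	assert(dev != NULL);
	dev->ras_enabled = false;
}

/*
 * Mirrors the shape of the patched loop: for each device in the reset
 * set, suspend RAS first, but only when that device needs a full reset.
 * Returns how many devices had RAS suspended.
 */
static size_t suspend_ras_before_reset(struct mock_device *devs, size_t n)
{
	size_t suspended = 0;

	for (size_t i = 0; i < n; ++i) {
		if (devs[i].need_full_reset) {
			mock_ras_suspend(&devs[i]);
			++suspended;
		}
		/* the real loop then goes on to walk each device's rings */
	}
	return suspended;
}
```

The point of the sketch is the ordering and the guard: the suspend happens per device, ahead of the per-ring work, and devices that do not need a full reset keep RAS untouched.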

