[PATCH] drm/amdgpu: fix for suspend/resume sequence under sriov

Christian König ckoenig.leichtzumerken at gmail.com
Thu Nov 3 07:43:00 UTC 2022


On 03.11.22 at 05:06, Victor Zhao wrote:
> - clear kiq ring after suspend/resume under sriov to avoid kiq ring
> test failure
> - update irq after resume to fix kiq interrupt loss

Good to see that somebody is taking a look at this. Is that enough to get
suspend/resume working with SRIOV?

>
> Signed-off-by: Victor Zhao <Victor.Zhao at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c     | 2 ++
>   2 files changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 522820eeaa59..5b9f992e4607 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4197,6 +4197,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
>   	}
>   
>   	/* Make sure IB tests flushed */
> +	if (amdgpu_sriov_vf(adev))
> +		amdgpu_irq_gpu_reset_resume_helper(adev);

This is a pretty clear NAK because that should happen during resume
anyway. If it doesn't happen, we have a bug somewhere else and this
change just hides it.

>   	flush_delayed_work(&adev->delayed_init_work);
>   
>   	if (adev->in_s0ix) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 7853d3ca58cf..49d34c7bbf20 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -6909,6 +6909,8 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring *ring)
>   		mutex_unlock(&adev->srbm_mutex);
>   	} else {
>   		memset((void *)mqd, 0, sizeof(*mqd));
> +		if (amdgpu_sriov_vf(adev) && adev->in_suspend)
> +			amdgpu_ring_clear_ring(ring);

Is there any good reason not to always clear the KIQ ring here, e.g.
also on bare metal and during load/reset?

Regards,
Christian.

>   		mutex_lock(&adev->srbm_mutex);
>   		nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
>   		amdgpu_ring_init_mqd(ring);
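
For illustration only, the question about always clearing the KIQ ring
can be sketched as a small user-space model. This is not the driver
code: struct mock_ring, MOCK_NOP, MOCK_RING_DWORDS and the mock_*
functions are hypothetical stand-ins, and the sketch assumes that
clearing a ring means filling every dword with a NOP packet. It only
shows the unconditional variant, with no check for SRIOV or suspend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration, not taken from the driver. */
#define MOCK_RING_DWORDS 8u        /* ring size in dwords, power of two */
#define MOCK_NOP         0xffff1000u

/* Hypothetical stand-in for the ring state touched by the patch. */
struct mock_ring {
	uint32_t buf[MOCK_RING_DWORDS];
	uint32_t buf_mask;         /* ring size in dwords minus one */
};

/* Assumed meaning of "clear the ring": overwrite every slot with a NOP. */
static void mock_ring_clear(struct mock_ring *ring)
{
	uint32_t i;

	/* buf_mask + 1 must be a power of two that fits the buffer. */
	assert(ring->buf_mask < MOCK_RING_DWORDS);
	assert(((ring->buf_mask + 1u) & ring->buf_mask) == 0u);

	for (i = 0; i <= ring->buf_mask; i++)
		ring->buf[i] = MOCK_NOP;
}

/*
 * Unconditional variant of the init path: the ring is cleared on every
 * call, whether this is first load, reset or resume, bare metal or VF.
 */
static void mock_kiq_init_queue(struct mock_ring *ring)
{
	mock_ring_clear(ring);
}

/* Returns 1 if no stale packet is left in the ring, 0 otherwise. */
static int mock_ring_is_clean(const struct mock_ring *ring)
{
	uint32_t i;

	for (i = 0; i <= ring->buf_mask; i++)
		if (ring->buf[i] != MOCK_NOP)
			return 0;
	return 1;
}
```

A caller would fill a ring with stale data, call mock_kiq_init_queue()
and then check mock_ring_is_clean(). Whether the real init path can
safely clear the ring in every case is exactly the open question raised
in the review above.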


