[PATCH] drm/amdgpu: fix reload KMD hang on GFX10 KIQ

Christian König ckoenig.leichtzumerken at gmail.com
Mon Aug 10 11:34:07 UTC 2020


On 10.08.20 at 05:59, Monk Liu wrote:
> The GFX10 KIQ will hang if we try the steps below:
> modprobe amdgpu
> rmmod amdgpu
> modprobe amdgpu sched_hw_submission=4
>
> The KIQ stays alive even after the KMD is unloaded, so on
> reload the KIQ crashes when its registers are programmed with
> values that differ from the previous load (settings such as the
> HQD address and ring size change easily when
> sched_hw_submission is altered).
>
> The fix is to deactivate the KIQ before touching any
> of its registers.
>
> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 622f442..0702c94 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -6435,6 +6435,10 @@ static int gfx_v10_0_kiq_init_register(struct amdgpu_ring *ring)
>   	struct v10_compute_mqd *mqd = ring->mqd_ptr;
>   	int j;
>   
> +	/* inactivate the queue */
> +	if (amdgpu_sriov_vf(adev))

Could you think of any reason why we shouldn't do this on bare metal as 
well?

I mean it can't hurt to be extra careful even if the KIQ shouldn't be 
running.

Christian.

> +		WREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE, 0);
> +
>   	/* disable wptr polling */
>   	WREG32_FIELD15(GC, 0, CP_PQ_WPTR_POLL_CNTL, EN, 0);
>   
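To illustrate the ordering issue discussed above, here is a minimal user-space sketch, not kernel code. Everything prefixed with mock_ or MOCK_ is hypothetical and only loosely modelled on the registers named in the patch. The rule that "changing a queue register while the queue is active hangs" is an assumption taken from the commit message, not from hardware documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for queue registers; not the real amdgpu layout. */
enum mock_reg {
	MOCK_CP_HQD_ACTIVE,
	MOCK_CP_HQD_PQ_BASE,
	MOCK_CP_HQD_PQ_CONTROL,
	MOCK_REG_COUNT
};

struct mock_gpu {
	uint32_t regs[MOCK_REG_COUNT];
	bool hung;
};

/*
 * Assumed model: changing a queue config register to a new value while
 * the queue is still marked active hangs the (mock) hardware.
 */
static void mock_wreg(struct mock_gpu *gpu, enum mock_reg reg, uint32_t val)
{
	assert(reg < MOCK_REG_COUNT);
	if (reg != MOCK_CP_HQD_ACTIVE &&
	    gpu->regs[MOCK_CP_HQD_ACTIVE] && gpu->regs[reg] != val)
		gpu->hung = true;
	gpu->regs[reg] = val;
}

/* Mock of the init path; deactivate_first mirrors the proposed fix. */
static void mock_kiq_init_register(struct mock_gpu *gpu, uint32_t pq_base,
				   uint32_t pq_control, bool deactivate_first)
{
	if (deactivate_first)
		mock_wreg(gpu, MOCK_CP_HQD_ACTIVE, 0);

	mock_wreg(gpu, MOCK_CP_HQD_PQ_BASE, pq_base);
	mock_wreg(gpu, MOCK_CP_HQD_PQ_CONTROL, pq_control);
	mock_wreg(gpu, MOCK_CP_HQD_ACTIVE, 1);
}

/* Load, "unload" (queue stays active), then reload with other settings. */
static bool mock_reload_hangs(bool deactivate_first)
{
	struct mock_gpu gpu = { 0 };

	mock_kiq_init_register(&gpu, 0x1000, 8, deactivate_first);
	/* Driver unload does nothing here: the queue is left active. */
	mock_kiq_init_register(&gpu, 0x2000, 4, deactivate_first);
	return gpu.hung;
}
```

In this model mock_reload_hangs(false) reports a hang and mock_reload_hangs(true) does not. Performing the deactivation unconditionally, as suggested in the reply, simply corresponds to always passing true.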


