[PATCH 1/2] drm/amdgpu: fix reload KMD hang on KIQ

Felix Kuehling felix.kuehling at amd.com
Fri Jul 31 13:57:11 UTC 2020


In gfx_v10_0_sw_fini the KIQ ring gets freed. Wouldn't that be the right
place to stop the KIQ? Otherwise KIQ will hang as soon as someone
allocates the memory that was previously used for the KIQ ring buffer
and overwrites it with something that's not a valid PM4 packet.

Regards,
  Felix

On 2020-07-31 at 3:51 a.m., Monk Liu wrote:
> KIQ will hang if we try the steps below:
> modprobe amdgpu
> rmmod amdgpu
> modprobe amdgpu sched_hw_submission=4
>
> The cause is that KIQ stays alive even after we unload the KMD,
> so when the KMD is reloaded, KIQ crashes as soon as its registers
> are programmed with values that differ from the previous
> configuration (settings such as the HQD address and ring size
> change easily if we alter sched_hw_submission).
>
> The fix is to deactivate KIQ before touching any of its
> registers.
>
> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index db9f1e8..f571e25 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -6433,6 +6433,9 @@ static int gfx_v10_0_kiq_init_register(struct amdgpu_ring *ring)
>  	struct v10_compute_mqd *mqd = ring->mqd_ptr;
>  	int j;
>  
> +	/* deactivate the queue */
> +	WREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE, 0);
> +
>  	/* disable wptr polling */
>  	WREG32_FIELD15(GC, 0, CP_PQ_WPTR_POLL_CNTL, EN, 0);
>  
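
The failure described in the commit message can be illustrated with a
small userspace toy model. This is only a sketch under stated
assumptions: struct toy_hqd and the toy_* functions are hypothetical
names invented for illustration, not amdgpu code, and the "hang" rule
(reprogramming a still-active queue with different values wedges it) is
a simplification of what the commit message reports.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a hardware queue descriptor; NOT the real amdgpu HQD layout. */
struct toy_hqd {
	bool active;        /* stands in for CP_HQD_ACTIVE */
	uint64_t ring_base; /* stands in for the HQD ring base address */
	uint32_t ring_size; /* stands in for the ring size */
	bool hung;          /* set once the modelled queue has wedged */
};

/*
 * Rewrite the ring registers. Assumption of this model: changing them
 * while the queue is still active wedges the queue.
 */
static void toy_hqd_program(struct toy_hqd *q, uint64_t base, uint32_t size)
{
	if (q->active && (q->ring_base != base || q->ring_size != size))
		q->hung = true;
	q->ring_base = base;
	q->ring_size = size;
}

/* Init path without the fix: reprogram directly, then activate. */
static bool toy_kiq_init_unsafe(struct toy_hqd *q, uint64_t base, uint32_t size)
{
	toy_hqd_program(q, base, size);
	q->active = true;
	return !q->hung;
}

/* Init path with the fix: deactivate the queue before touching it. */
static bool toy_kiq_init_safe(struct toy_hqd *q, uint64_t base, uint32_t size)
{
	q->active = false;
	toy_hqd_program(q, base, size);
	q->active = true;
	return !q->hung;
}
```

Calling toy_kiq_init_unsafe() twice on the same struct with different
ring parameters models the modprobe / rmmod / modprobe sequence, because
the "hardware" state survives the reload; the second call reports a
hang. The same sequence through toy_kiq_init_safe() does not.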

