[PATCH] drm/amdgpu: fix gpu recovery disable with per queue reset

Lazar, Lijo lijo.lazar at amd.com
Thu Jan 9 06:14:06 UTC 2025



On 1/9/2025 1:31 AM, Jonathan Kim wrote:
> Per queue reset should be bypassed when gpu recovery is disabled
> with module parameter.
> 
> Signed-off-by: Jonathan Kim <jonathan.kim at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> index cc66ebb7bae1..441568163e20 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> @@ -1131,6 +1131,9 @@ uint64_t kgd_gfx_v9_hqd_get_pq_addr(struct amdgpu_device *adev,
>  	uint32_t low, high;
>  	uint64_t queue_addr = 0;
>  
> +	if (!amdgpu_gpu_recovery)
> +		return 0;
> +
>  	kgd_gfx_v9_acquire_queue(adev, pipe_id, queue_id, inst);
>  	amdgpu_gfx_rlc_enter_safe_mode(adev, inst);
>  
> @@ -1179,6 +1182,9 @@ uint64_t kgd_gfx_v9_hqd_reset(struct amdgpu_device *adev,
>  	uint32_t low, high, pipe_reset_data = 0;
>  	uint64_t queue_addr = 0;
>  
> +	if (!amdgpu_gpu_recovery)
> +		return 0;
> +

I think the right place for this check is not inside the callback; it
should be at the place where the callback gets called.
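
A minimal, standalone sketch of that arrangement follows. Every name in
it (gpu_recovery, struct queue_ops, demo_hqd_reset, try_reset_queue) is
a hypothetical stand-in rather than the actual amdgpu/KFD call path,
which is not shown in this thread. The callback does only the reset
work, and the caller decides whether a reset is allowed at all:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the amdgpu_gpu_recovery module parameter. */
static int gpu_recovery = 1;

/* Hypothetical callback table; the real driver structure is not shown here. */
struct queue_ops {
	uint64_t (*hqd_reset)(uint32_t pipe_id, uint32_t queue_id);
};

/* Counts how often the callback actually ran (for illustration only). */
static unsigned int reset_calls;

/* The callback performs only the (simulated) reset; it holds no policy check. */
static uint64_t demo_hqd_reset(uint32_t pipe_id, uint32_t queue_id)
{
	reset_calls++;
	/* Placeholder value standing in for a queue address. */
	return ((uint64_t)pipe_id << 32) | queue_id;
}

/* The call site owns the policy: skip the callback when recovery is disabled. */
static uint64_t try_reset_queue(const struct queue_ops *ops,
				uint32_t pipe_id, uint32_t queue_id)
{
	assert(ops && ops->hqd_reset);

	if (!gpu_recovery)
		return 0;

	return ops->hqd_reset(pipe_id, queue_id);
}
```

With the check at the call site, any other implementation plugged into
the same callback slot would get the same behaviour without repeating
the test in each one.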

Thanks,
Lijo

>  	kgd_gfx_v9_acquire_queue(adev, pipe_id, queue_id, inst);
>  	amdgpu_gfx_rlc_enter_safe_mode(adev, inst);
>  


