[PATCH 2/2] drm/amdgpu: abort KIQ waits when there is a pending reset

Lazar, Lijo lijo.lazar at amd.com
Sat Aug 3 15:12:27 UTC 2024



On 8/3/2024 12:09 AM, Victor Skvortsov wrote:
> Stop waiting for the KIQ to return when there is a reset pending.
> It's quite likely that the KIQ will never respond.
> 
> Signed-off-by: Victor Skvortsov <victor.skvortsov at amd.com>

Copying Christian/Vignesh

This patch originally came from Christian. Please keep Christian as the
author; you may add your own Tested-by tag.

Thanks,
Lijo

> Suggested-by: Lazar Lijo <Lijo.Lazar at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   | 3 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 5 +++++
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index c02659025656..8962be257942 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -785,7 +785,8 @@ void amdgpu_gmc_fw_reg_write_reg_wait(struct amdgpu_device *adev,
>  		goto failed_kiq;
>  
>  	might_sleep();
> -	while (r < 1 && cnt++ < MAX_KIQ_REG_TRY) {
> +	while (r < 1 && cnt++ < MAX_KIQ_REG_TRY &&
> +		!amdgpu_reset_pending(adev->reset_domain)) {
>  
>  		msleep(MAX_KIQ_REG_BAILOUT_INTERVAL);
>  		r = amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> index 4ae581f3fcb5..f33a4e0ffba1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> @@ -136,6 +136,11 @@ static inline bool amdgpu_reset_domain_schedule(struct amdgpu_reset_domain *doma
>  	return queue_work(domain->wq, work);
>  }
>  
> +static inline bool amdgpu_reset_pending(struct amdgpu_reset_domain *domain) {
> +	lockdep_assert_held(&domain->sem);
> +	return rwsem_is_contended(&domain->sem);
> +}
> +
>  void amdgpu_device_lock_reset_domain(struct amdgpu_reset_domain *reset_domain);
>  
>  void amdgpu_device_unlock_reset_domain(struct amdgpu_reset_domain *reset_domain);
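
For readers following the thread, the control flow that the hunk in
amdgpu_gmc.c adds to the wait loop can be sketched in user space roughly
as below. This is only an illustrative sketch under stated assumptions,
not the kernel code: fake_reset_domain, fake_fence, fake_reset_pending,
fake_fence_poll, wait_with_abort and MAX_TRIES are all hypothetical
stand-ins. In particular, the quoted amdgpu_reset_pending() checks
rwsem_is_contended() on the domain semaphore, while the sketch replaces
that with a plain flag, and the msleep() between polls is left out.

```c
#include <stdbool.h>

#define MAX_TRIES 5 /* hypothetical stand-in for MAX_KIQ_REG_TRY */

/* Hypothetical stand-in for the reset domain: a plain flag, not an rwsem. */
struct fake_reset_domain {
	bool reset_pending;
};

/* Hypothetical fence: signals after N more polls; negative means never. */
struct fake_fence {
	int polls_until_signaled;
};

static bool fake_reset_pending(const struct fake_reset_domain *domain)
{
	return domain->reset_pending;
}

/* Returns 1 once the fence has signaled, 0 otherwise. */
static int fake_fence_poll(struct fake_fence *fence)
{
	if (fence->polls_until_signaled < 0)
		return 0;
	if (fence->polls_until_signaled == 0)
		return 1;
	fence->polls_until_signaled--;
	return 0;
}

/*
 * Poll the fence up to MAX_TRIES times, but stop as soon as a reset is
 * pending. Returns the number of polls performed and reports through
 * *signaled whether the fence completed.
 */
static int wait_with_abort(struct fake_fence *fence,
			   const struct fake_reset_domain *domain,
			   bool *signaled)
{
	int r = 0, cnt = 0, polls = 0;

	while (r < 1 && cnt++ < MAX_TRIES &&
	       !fake_reset_pending(domain)) {
		r = fake_fence_poll(fence);
		polls++;
	}

	*signaled = r >= 1;
	return polls;
}
```

With no reset pending, a fence that never signals still costs all
MAX_TRIES polls. With the flag set, the loop exits before the first poll,
which is the behaviour the patch is after when the KIQ is unlikely to
answer.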
