[PATCH] drm/amdkfd: flag added to handle errors from svm validate and map

Felix Kuehling felix.kuehling at amd.com
Mon May 29 21:38:46 UTC 2023


On 2023-05-29 17:11, Alex Sierra wrote:
> If validation and mapping of a prange returns an error, this flag is
> set. This is rare, but it can happen when
> `amdgpu_hmm_range_get_pages_done` returns true. In such cases the
> caller should retry, and the prange must still be updated correctly
> during that retry.
>
> Signed-off-by: Alex Sierra <alex.sierra at amd.com>

Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>


> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ++-
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 1 +
>   2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index fcfde9140bce..910c0269598a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -823,7 +823,7 @@ svm_range_is_same_attrs(struct kfd_process *p, struct svm_range *prange,
>   		}
>   	}
>   
> -	return true;
> +	return !prange->is_error_flag;
>   }
>   
>   /**
> @@ -1657,6 +1657,7 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
>   unreserve_out:
>   	svm_range_unreserve_bos(&ctx);
>   
> +	prange->is_error_flag = !!r;
>   	if (!r)
>   		prange->validate_timestamp = ktime_get_boottime();
>   
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> index 7a33b93f9df6..b716d4bf7ee0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> @@ -133,6 +133,7 @@ struct svm_range {
>   	DECLARE_BITMAP(bitmap_aip, MAX_GPU_INSTANCE);
>   	bool				validated_once;
>   	bool				mapped_to_gpu;
> +	bool				is_error_flag;
>   };
>   
>   static inline void svm_range_lock(struct svm_range *prange)
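
For readers following the two hunks together, here is a minimal user-space
sketch of how the flag is meant to interact with the "same attributes"
shortcut. It is not the driver code: `struct sketch_range`, the `sketch_*`
functions and the `map_calls` counter are hypothetical stand-ins, and
`-EAGAIN` is only an assumed example of a "please retry" error. The point it
illustrates is that a failed validate-and-map makes the attribute comparison
report "not the same", so a retry with unchanged attributes is not skipped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct svm_range; only what the sketch needs. */
struct sketch_range {
	bool is_error_flag;	/* set when the last validate-and-map failed */
	int map_calls;		/* illustration only: counts mapping attempts */
};

/*
 * Models the tail of svm_range_validate_and_map(): whatever result r the
 * mapping produced, remember whether it was an error.
 */
static int sketch_validate_and_map(struct sketch_range *prange, int r)
{
	assert(prange != NULL);
	prange->map_calls++;
	prange->is_error_flag = !!r;
	return r;
}

/*
 * Models svm_range_is_same_attrs() for the case where every attribute
 * already matches: the range only counts as "same" if the previous
 * validate-and-map did not fail.
 */
static bool sketch_is_same_attrs(const struct sketch_range *prange)
{
	return !prange->is_error_flag;
}

/*
 * Hypothetical caller: skip the remap when nothing changed and the last
 * attempt succeeded, otherwise validate and map again.
 */
static int sketch_set_attr(struct sketch_range *prange, int r)
{
	if (sketch_is_same_attrs(prange))
		return 0;
	return sketch_validate_and_map(prange, r);
}
```

With this shape, a first attempt that fails leaves `is_error_flag` set, the
retry goes through the mapping path again even though the attributes are
unchanged, and a successful retry clears the flag so later calls can take
the shortcut.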

