[PATCH v2] drm/amdkfd: fix some race conditions in vram buffer alloc/free of svm code
Felix Kuehling
felix.kuehling at amd.com
Thu Sep 21 19:33:35 UTC 2023
On 2023-09-20 12:09, Xiaogang.Chen wrote:
> From: Xiaogang Chen <xiaogang.chen at amd.com>
>
> This patch fixes:
> 1: The ref count of prange's svm_bo is decreased by an async call from hmm. When
> waiting for prange's svm_bo to be released we should also wait for prange->svm_bo
> to become NULL, otherwise prange->svm_bo may be set to NULL after a new vram
> buffer has been allocated.
>
> 2: While waiting in a loop for prange's svm_bo to be released, we should reschedule
> the current task to give other tasks an opportunity to run, especially the workqueue
> task that handles the svm_bo ref release; otherwise we may hit a soft lockup.
>
> Signed-off-by: Xiaogang.Chen <Xiaogang.Chen at amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index bed0f8bf83c7..164cd77af62d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -502,11 +502,11 @@ svm_range_validate_svm_bo(struct kfd_node *node, struct svm_range *prange)
>
> /* We need a new svm_bo. Spin-loop to wait for concurrent
> * svm_range_bo_release to finish removing this range from
> - * its range list. After this, it is safe to reuse the
> - * svm_bo pointer and svm_bo_list head.
> + * its range list and set prange->svm_bo to null. After this,
> + * it is safe to reuse the svm_bo pointer and svm_bo_list head.
> */
> - while (!list_empty_careful(&prange->svm_bo_list))
> - ;
> + while (!list_empty_careful(&prange->svm_bo_list) || prange->svm_bo)
> + cond_resched();
>
> return false;
> }
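
For readers following the thread, the difference between the old and new wait conditions can be sketched in plain user-space C. This is only an illustration, not kernel code: struct fake_range, fake_release_step() and fake_yield() are hypothetical stand-ins, and the sketch assumes the release path first unlinks the range from the list and only afterwards clears the svm_bo pointer. The yield stand-in lets the simulated release worker advance one step, which is roughly the opportunity cond_resched() is meant to give the real worker. The original loop body was empty; the list-only variant yields here only so that the simulation terminates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of state the wait loop checks. */
struct fake_range {
    bool on_bo_list;   /* models !list_empty_careful(&prange->svm_bo_list) */
    void *svm_bo;      /* models prange->svm_bo */
    int release_step;  /* progress of the simulated release worker */
};

/*
 * Simulated release worker, one step per call (assumed ordering):
 * step 1 unlinks the range from the list, step 2 clears the pointer.
 */
static void fake_release_step(struct fake_range *r)
{
    if (r->release_step == 0) {
        r->on_bo_list = false;
        r->release_step = 1;
    } else if (r->release_step == 1) {
        r->svm_bo = NULL;
        r->release_step = 2;
    }
}

/* Stand-in for yielding the CPU: the worker gets to run one step. */
static void fake_yield(struct fake_range *r)
{
    fake_release_step(r);
}

/* Old condition: wait only until the range is off the list. */
static int wait_list_only(struct fake_range *r)
{
    int yields = 0;

    while (r->on_bo_list) {
        fake_yield(r);
        yields++;
    }
    return yields;
}

/* New condition: also wait until the svm_bo pointer has been cleared. */
static int wait_list_and_pointer(struct fake_range *r)
{
    int yields = 0;

    while (r->on_bo_list || r->svm_bo) {
        fake_yield(r);
        yields++;
    }
    return yields;
}
```

In this simulation the list-only wait returns while svm_bo is still set, so a later step of the worker could clear a pointer the caller has already replaced. The combined condition returns only after both updates are done.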