[PATCH 1/2] drm/amdkfd: wait migration done only if migration starts

Felix Kuehling felix.kuehling at amd.com
Thu Apr 29 06:10:31 UTC 2021


On 2021-04-28 at 9:53 p.m., Philip Yang wrote:

> If migration vma setup, but failed before start sdma memory copy, e.g.
> process is killed, don't wait for sdma fence done.

I think you could describe this more generally as "Handle errors
returned by svm_migrate_copy_to_vram/ram".


>
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index 6b810863f6ba..19b08247ba8a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -460,10 +460,12 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
>  	}
>  
>  	if (migrate.cpages) {
> -		svm_migrate_copy_to_vram(adev, prange, &migrate, &mfence,
> -					 scratch);
> -		migrate_vma_pages(&migrate);
> -		svm_migrate_copy_done(adev, mfence);
> +		r = svm_migrate_copy_to_vram(adev, prange, &migrate, &mfence,
> +					     scratch);
> +		if (!r) {
> +			migrate_vma_pages(&migrate);
> +			svm_migrate_copy_done(adev, mfence);

I think there are failure cases where svm_migrate_copy_to_vram
successfully copies some pages but fails somewhere in the middle. I
think in those cases you still want to call migrate_vma_pages and
svm_migrate_copy_done. If the copy never started for some reason, there
should be no mfence and svm_migrate_copy_done should be a no-op.
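
To illustrate the flow I have in mind, here is a minimal user-space
sketch. Everything in it is a hypothetical stand-in: the mock_* helpers
and struct fence are not the real kfd or migrate_vma functions, and it
assumes the copy helper only sets mfence once at least one copy has been
submitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA fence; not the kernel's struct dma_fence. */
struct fence { int signaled; };

static int pages_finalized;	/* how often the "migrate pages" step ran */
static int fence_waits;		/* how often we actually waited on a fence */

/*
 * Mock copy: submits one copy per page and fails at page index fail_at
 * (never fails if fail_at < 0). Assumption: *mfence is only set once at
 * least one copy has been submitted.
 */
static int mock_copy(int npages, int fail_at, struct fence *storage,
		     struct fence **mfence)
{
	for (int i = 0; i < npages; i++) {
		if (i == fail_at)
			return -1;
		*mfence = storage;	/* a copy is now in flight */
	}
	return 0;
}

/* Mock wait: a no-op when no copy was ever submitted. */
static void mock_copy_done(struct fence *mfence)
{
	if (!mfence)
		return;
	mfence->signaled = 1;
	fence_waits++;
}

/* Mock for the step that commits the pages that were copied. */
static void mock_migrate_pages(void)
{
	pages_finalized++;
}

/*
 * Report the copy error to the caller, but still run the page step and
 * the fence wait so that a partially completed copy is not abandoned.
 */
static int migrate(int npages, int fail_at)
{
	struct fence storage = { 0 };
	struct fence *mfence = NULL;
	int r;

	assert(npages >= 0);
	r = mock_copy(npages, fail_at, &storage, &mfence);
	mock_migrate_pages();
	mock_copy_done(mfence);
	return r;
}
```

In this sketch, migrate(4, 2) fails partway and still waits on the
fence, while migrate(4, 0) fails before any copy is submitted and the
wait is skipped because mfence is still NULL.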

I probably don't understand the failure scenario you encountered. Can
you explain that in more detail?

Thanks,
  Felix


> +		}
>  		migrate_vma_finalize(&migrate);
>  	}
>  
> @@ -663,10 +665,12 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
>  	pr_debug("cpages %ld\n", migrate.cpages);
>  
>  	if (migrate.cpages) {
> -		svm_migrate_copy_to_ram(adev, prange, &migrate, &mfence,
> -					scratch);
> -		migrate_vma_pages(&migrate);
> -		svm_migrate_copy_done(adev, mfence);
> +		r = svm_migrate_copy_to_ram(adev, prange, &migrate, &mfence,
> +					    scratch);
> +		if (!r) {
> +			migrate_vma_pages(&migrate);
> +			svm_migrate_copy_done(adev, mfence);
> +		}
>  		migrate_vma_finalize(&migrate);
>  	} else {
>  		pr_debug("failed collect migrate device pages [0x%lx 0x%lx]\n",

