[PATCH v2] drm/amdkfd: AIP mGPUs best prefetch location for xnack on

Felix Kuehling felix.kuehling at amd.com
Tue Aug 10 03:52:46 UTC 2021


On 2021-08-09 at 6:21 p.m., Philip Yang wrote:
> For xnack on, if the range is ACCESS or ACCESS_IN_PLACE (AIP) by a single
> GPU, or the range is ACCESS_IN_PLACE by mGPUs and all mGPUs are connected
> over XGMI in the same hive, the best prefetch location is the prefetch_loc
> GPU. Otherwise, the best prefetch location is always CPU because GPU can
> not map vram of other GPUs through small bar PCIe.

I don't think small-BAR is really a factor here. Even with large-BAR,
our P2P mappings are not coherent the way XGMI mappings are, so we
wouldn't be able to use P2P even on large-BAR systems. I would modify
this sentence:

> Otherwise, the best
> prefetch location is always CPU because GPU can not coherently map vram
> of other GPUs through PCIe.
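
For illustration only, the rule in the reworded sentence can be modeled
in a few lines of user-space C. This is a minimal sketch, not the
kfd_svm.c code: struct model_gpu, model_same_hive(), model_best_loc()
and MODEL_LOC_CPU are hypothetical names, and treating hive_id == 0 as
"not in any XGMI hive" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU location; not a kernel constant. */
#define MODEL_LOC_CPU 0u

/* Hypothetical model of a GPU: a nonzero id plus its XGMI hive id. */
struct model_gpu {
	uint32_t id;      /* nonzero GPU id */
	uint64_t hive_id; /* assumption: 0 means "not in any XGMI hive" */
};

/* Two GPUs can map each other's VRAM coherently only in the same hive. */
static bool model_same_hive(const struct model_gpu *a,
			    const struct model_gpu *b)
{
	return a->hive_id != 0 && a->hive_id == b->hive_id;
}

/*
 * Return the prefetch GPU id if every GPU that must map the range is
 * either the prefetch GPU itself or in the same hive as it. Otherwise
 * fall back to the CPU, because a coherent mapping over PCIe is not
 * available.
 */
static uint32_t model_best_loc(const struct model_gpu *prefetch,
			       const struct model_gpu *mapped, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (mapped[i].id == prefetch->id)
			continue;
		if (!model_same_hive(prefetch, &mapped[i]))
			return MODEL_LOC_CPU;
	}
	return prefetch->id;
}
```

In this model BAR size never appears: the fallback to CPU depends only
on whether the GPUs share a hive, which is the point of the rewording.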


>
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 35 +++++++++++++++-------------
>  1 file changed, 19 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index f811a3a24cd2..5bd51a15fb00 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -2719,22 +2719,26 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
>  	return 0;
>  }
>  
> -/* svm_range_best_prefetch_location - decide the best prefetch location
> +/**
> + * svm_range_best_prefetch_location - decide the best prefetch location
>   * @prange: svm range structure
>   *
>   * For xnack off:
> - * If range map to single GPU, the best acutal location is prefetch loc, which
> + * If range map to single GPU, the best prefetch location is prefetch_loc, which
>   * can be CPU or GPU.
>   *
> - * If range map to multiple GPUs, only if mGPU connection on xgmi same hive,
> - * the best actual location could be prefetch_loc GPU. If mGPU connection on
> - * PCIe, the best actual location is always CPU, because GPU cannot access vram
> - * of other GPUs, assuming PCIe small bar (large bar support is not upstream).
> + * If range is ACCESS or ACCESS_IN_PLACE by mGPUs, only if mGPU connection on
> + * XGMI same hive, the best prefetch location is prefetch_loc GPU, otherwise
> + * the best prefetch location is always CPU, because GPU can not map vram of
> + * other GPUs, assuming PCIe small bar (large bar support is not upstream).

Same as above. With that fixed, the patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>


>   *
>   * For xnack on:
> - * The best actual location is prefetch location. If mGPU connection on xgmi
> - * same hive, range map to multiple GPUs. Otherwise, the range only map to
> - * actual location GPU. Other GPU access vm fault will trigger migration.
> + * If range is not ACCESS_IN_PLACE by mGPUs, the best prefetch location is
> + * prefetch_loc, other GPU access will generate vm fault and trigger migration.
> + *
> + * If range is ACCESS_IN_PLACE by mGPUs, only if mGPU connection on XGMI same
> + * hive, the best prefetch location is prefetch_loc GPU, otherwise the best
> + * prefetch location is always CPU, because GPU cannot map vram of other GPUs.
>   *
>   * Context: Process context
>   *
> @@ -2754,11 +2758,6 @@ svm_range_best_prefetch_location(struct svm_range *prange)
>  
>  	p = container_of(prange->svms, struct kfd_process, svms);
>  
> -	/* xnack on */
> -	if (p->xnack_enabled)
> -		goto out;
> -
> -	/* xnack off */
>  	if (!best_loc || best_loc == KFD_IOCTL_SVM_LOCATION_UNDEFINED)
>  		goto out;
>  
> @@ -2768,8 +2767,12 @@ svm_range_best_prefetch_location(struct svm_range *prange)
>  		best_loc = 0;
>  		goto out;
>  	}
> -	bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip,
> -		  MAX_GPU_INSTANCE);
> +
> +	if (p->xnack_enabled)
> +		bitmap_copy(bitmap, prange->bitmap_aip, MAX_GPU_INSTANCE);
> +	else
> +		bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip,
> +			  MAX_GPU_INSTANCE);
>  
>  	for_each_set_bit(gpuidx, bitmap, MAX_GPU_INSTANCE) {
>  		pdd = kfd_process_device_from_gpuidx(p, gpuidx);
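
The last hunk changes which GPUs are checked before the loop runs. A
minimal user-space sketch of that selection follows, using plain 64-bit
masks instead of the kernel bitmap helpers. struct model_range and
model_gpus_to_check() are hypothetical names for illustration, not
driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-range access attributes. */
struct model_range {
	uint64_t access_mask; /* bit i set: GPU i has ACCESS */
	uint64_t aip_mask;    /* bit i set: GPU i has ACCESS_IN_PLACE */
};

/*
 * Choose the set of GPUs whose connectivity constrains the prefetch
 * location.
 *
 * xnack on:  only ACCESS_IN_PLACE GPUs matter. An ACCESS-only GPU can
 *            take a vm fault and trigger migration instead.
 * xnack off: both ACCESS and ACCESS_IN_PLACE GPUs must be considered.
 */
static uint64_t model_gpus_to_check(const struct model_range *r,
				    bool xnack_enabled)
{
	if (xnack_enabled)
		return r->aip_mask;
	return r->access_mask | r->aip_mask;
}
```

For a range with ACCESS on GPUs 0 and 1 and ACCESS_IN_PLACE on GPU 2,
the sketch checks only GPU 2 with xnack on and all three with xnack off.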

