[PATCH] drm/amdkfd: AIP mGPUs best prefetch location for xnack on
Philip Yang
Philip.Yang at amd.com
Mon Aug 2 16:28:06 UTC 2021
For xnack on, if range ACCESS or ACCESS_IN_PLACE (AIP) by single GPU, or
range is ACCESS_IN_PLACE by mGPUs and all mGPUs connection on xgmi same
hive, the best prefetch location is prefetch_loc GPU. Otherwise, the best
prefetch location is always CPU because GPU can not map vram of other
GPUs through small bar PCIe.
Signed-off-by: Philip Yang <Philip.Yang at amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 44 +++++++++++++++++-----------
1 file changed, 27 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 69237d2ab2ad..6160c301f78b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2754,22 +2754,29 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
return 0;
}
-/* svm_range_best_prefetch_location - decide the best prefetch location
+/**
+ * svm_range_best_prefetch_location - decide the best prefetch location
* @prange: svm range structure
*
* For xnack off:
- * If range map to single GPU, the best acutal location is prefetch loc, which
+ * If range map to single GPU, the best prefetch location is prefetch_loc, which
* can be CPU or GPU.
*
- * If range map to multiple GPUs, only if mGPU connection on xgmi same hive,
- * the best actual location could be prefetch_loc GPU. If mGPU connection on
- * PCIe, the best actual location is always CPU, because GPU cannot access vram
- * of other GPUs, assuming PCIe small bar (large bar support is not upstream).
+ * If range is ACCESS or ACCESS_IN_PLACE by mGPUs, only if mGPU connection on
+ * xgmi same hive, the best prefetch location is prefetch_loc GPU. If mGPUs
+ * connection on PCIe, the best prefetch location is always CPU, because GPU
+ * cannot map vram of other GPUs, assuming PCIe small bar (large bar support is
+ * not upstream).
*
* For xnack on:
- * The best actual location is prefetch location. If mGPU connection on xgmi
- * same hive, range map to multiple GPUs. Otherwise, the range only map to
- * actual location GPU. Other GPU access vm fault will trigger migration.
+ * If range map to single GPU or is not ACCESS_IN_PLACE by mGPUs, the best
+ * prefetch location is prefetch_loc, other GPU access vm fault will trigger
+ * migration.
+ *
+ * If range is ACCESS_IN_PLACE by mGPUs, only if mGPU connection on xgmi same
+ * hive, the best prefetch location is prefetch_loc GPU. If mGPUs connection on
+ * PCIe, the best prefetch location is always CPU, because GPU cannot map vram
+ * of other GPUs.
*
* Context: Process context
*
@@ -2787,14 +2794,13 @@ svm_range_best_prefetch_location(struct svm_range *prange)
struct kfd_process *p;
uint32_t gpuidx;
- p = container_of(prange->svms, struct kfd_process, svms);
-
- /* xnack on */
- if (p->xnack_enabled)
+ if (!best_loc || best_loc == KFD_IOCTL_SVM_LOCATION_UNDEFINED)
goto out;
- /* xnack off */
- if (!best_loc || best_loc == KFD_IOCTL_SVM_LOCATION_UNDEFINED)
+ p = container_of(prange->svms, struct kfd_process, svms);
+
+ if (p->xnack_enabled &&
+ bitmap_weight(prange->bitmap_aip, MAX_GPU_INSTANCE) <= 1)
goto out;
bo_adev = svm_range_get_adev_by_id(prange, best_loc);
@@ -2803,8 +2809,12 @@ svm_range_best_prefetch_location(struct svm_range *prange)
best_loc = 0;
goto out;
}
- bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip,
- MAX_GPU_INSTANCE);
+
+ if (p->xnack_enabled)
+ bitmap_copy(bitmap, prange->bitmap_aip, MAX_GPU_INSTANCE);
+ else
+ bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip,
+ MAX_GPU_INSTANCE);
for_each_set_bit(gpuidx, bitmap, MAX_GPU_INSTANCE) {
pdd = kfd_process_device_from_gpuidx(p, gpuidx);
--
2.17.1
More information about the amd-gfx
mailing list