[PATCH] drm/amdgpu: change kgd2kfd_init_zone sequence during device_init
Shikang Fan
shikang.fan at amd.com
Wed Jul 31 08:10:59 UTC 2024
Move kgd2kfd_init_zone_device() after release_full_gpu(), as it
takes a long time on ASICs with a huge BAR size and could
potentially hit the full access timeout for SRIOV during init.
Signed-off-by: Shikang Fan <shikang.fan at amd.com>
---
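Note (not part of the commit message): the effect of the reordering can be
pictured with a small standalone model. Everything in it is hypothetical:
the millisecond costs, the FULL_ACCESS_BUDGET_MS window and the helper
init_within_budget() are invented for illustration and are not amdgpu or
KFD interfaces. It only sketches why a slow step run while exclusive access
is held can exceed the window, while the same step run after release cannot.

```c
#include <stdbool.h>

/* Hypothetical numbers, chosen only for illustration. */
#define FULL_ACCESS_BUDGET_MS 500 /* assumed exclusive-access window */
#define IP_INIT_COST_MS       300 /* assumed cost of the rest of IP init */
#define ZONE_INIT_COST_MS     400 /* assumed cost of mapping a huge BAR */

/*
 * Returns true if exclusive access would be released within the budget.
 * zone_init_inside_window models the old order (true) or new order (false).
 */
static bool init_within_budget(bool zone_init_inside_window)
{
	int time_in_window_ms = IP_INIT_COST_MS;

	/* Old order: zone-device setup runs while full access is held. */
	if (zone_init_inside_window)
		time_in_window_ms += ZONE_INIT_COST_MS;

	/*
	 * New order: zone-device setup runs after release, so its cost
	 * is not counted against the exclusive-access window.
	 */
	return time_in_window_ms <= FULL_ACCESS_BUDGET_MS;
}
```

With these assumed costs, init_within_budget(true) is false (700 ms in the
window) and init_within_budget(false) is true (300 ms in the window).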
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3a43754e7f10..4494fa7ae70f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2930,10 +2930,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	amdgpu_ttm_set_buffer_funcs_status(adev, true);
 	/* Don't init kfd if whole hive need to be reset during init */
-	if (!adev->gmc.xgmi.pending_reset) {
-		kgd2kfd_init_zone_device(adev);
+	if (!adev->gmc.xgmi.pending_reset)
 		amdgpu_amdkfd_device_init(adev);
-	}
 	amdgpu_fru_get_product_info(adev);
@@ -4362,6 +4360,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 		flush_delayed_work(&adev->delayed_init_work);
 	}
+	/* On ASICs with a huge BAR size, memremap_pages() can take a long
+	 * time and potentially lead to a full access timeout for SRIOV.
+	 * Move init_zone_device() to after exiting full GPU access.
+	 */
+	if (!adev->gmc.xgmi.pending_reset)
+		kgd2kfd_init_zone_device(adev);
+
 	/*
 	 * Place those sysfs registering after `late_init`. As some of those
 	 * operations performed in `late_init` might affect the sysfs
--
2.34.1