[PATCH v2] drm/amdgpu: Init zone device and drm client after mode-1 reset on reload
Ahmad Rehman
Ahmad.Rehman at amd.com
Tue Mar 5 17:20:39 UTC 2024
In passthrough environment, when amdgpu is reloaded after unload, mode-1
is triggered after initializing the necessary IPs, That init does not
include KFD, and KFD init waits until the reset is completed. KFD init
is called in the reset handler, but in this case, the zone device and
drm client is not initialized, causing app to create kernel panic.
v2: Removing the init KFD condition from amdgpu_amdkfd_drm_client_create.
As the previous version has the potential of creating DRM client twice.
Signed-off-by: Ahmad Rehman <Ahmad.Rehman at amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 3 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index f5f2945711be..c229dbdf5d47 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -146,9 +146,6 @@ int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev)
{
int ret;
- if (!adev->kfd.init_complete)
- return 0;
-
ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd",
&kfd_client_funcs);
if (ret) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 15b188aaf681..e783f5c8f5ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2479,8 +2479,10 @@ static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work)
}
for (i = 0; i < mgpu_info.num_dgpu; i++) {
adev = mgpu_info.gpu_ins[i].adev;
- if (!adev->kfd.init_complete)
+ if (!adev->kfd.init_complete) {
+ kgd2kfd_init_zone_device(adev);
amdgpu_amdkfd_device_init(adev);
+ }
amdgpu_ttm_set_buffer_funcs_status(adev, true);
}
}
--
2.34.1
More information about the amd-gfx
mailing list