[PATCH] drm/amdgpu: Check hive->reset_domain not NULL before releasing it.
Gavin Wan
Gavin.Wan at amd.com
Tue Nov 1 18:49:13 UTC 2022
A recent change introduced a bug in SRIOV environments: the kernel
crashes when amdgpu is unloaded on a guest VM with a hive
configuration. hive->reset_domain is not used (and so is never
initialized) under SRIOV, but amdgpu_xgmi_hive_release() released it
without first checking whether it was set.
Check that hive->reset_domain is not NULL before releasing it.
Fixes: d95e8e97e2d5 ("drm/amdgpu: refine create and release logic of hive info")
Signed-off-by: Gavin Wan <Gavin.Wan at amd.com>
Change-Id: I17189e4d7357e399c6b70e43c24051356c025a3a
---
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 47159e9a0884..371c4f1aac2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -217,8 +217,15 @@ static void amdgpu_xgmi_hive_release(struct kobject *kobj)
struct amdgpu_hive_info *hive = container_of(
kobj, struct amdgpu_hive_info, kobj);
- amdgpu_reset_put_reset_domain(hive->reset_domain);
- hive->reset_domain = NULL;
+ /*
+ * hive->reset_domain is only initialized for non-SRIOV
+ * configurations, so it may be NULL here. Check it before
+ * releasing it.
+ */
+ if (hive->reset_domain) {
+ amdgpu_reset_put_reset_domain(hive->reset_domain);
+ hive->reset_domain = NULL;
+ }
mutex_destroy(&hive->hive_lock);
kfree(hive);
--
2.34.1
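
For readers who want to try the pattern outside the kernel, here is a
minimal user-space sketch of the guarded release. It is not the driver
code: struct reset_domain, struct hive_info, put_reset_domain(),
hive_create(), hive_release() and the domains_freed counter are all
hypothetical stand-ins, and the sketch assumes that the real put helper
dereferences its argument unconditionally, as the reported crash
suggests.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for the driver structures. */
struct reset_domain {
	int refcount;
};

struct hive_info {
	/* Stays NULL when the domain is never set up (models SRIOV). */
	struct reset_domain *reset_domain;
};

/* Counts destroyed domains so the behaviour can be observed. */
static int domains_freed;

/* Drops one reference. Dereferences its argument, so NULL is not allowed. */
static void put_reset_domain(struct reset_domain *domain)
{
	assert(domain != NULL);
	assert(domain->refcount > 0);
	if (--domain->refcount == 0) {
		free(domain);
		domains_freed++;
	}
}

/* Allocates a hive; with_domain == 0 models the SRIOV case. */
static struct hive_info *hive_create(int with_domain)
{
	struct hive_info *hive = calloc(1, sizeof(*hive));

	if (!hive)
		return NULL;
	if (with_domain) {
		hive->reset_domain = calloc(1, sizeof(*hive->reset_domain));
		if (!hive->reset_domain) {
			free(hive);
			return NULL;
		}
		hive->reset_domain->refcount = 1;
	}
	return hive;
}

/* Release path with the guard: only put the domain if it was set up. */
static void hive_release(struct hive_info *hive)
{
	if (hive->reset_domain) {
		put_reset_domain(hive->reset_domain);
		hive->reset_domain = NULL;
	}
	free(hive);
}
```

Calling hive_release() on a hive created with hive_create(0) skips the
put entirely, which is the behaviour the patch adds. Without the guard,
the same call would reach put_reset_domain() with a NULL pointer.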