[PATCH] drm/amdgpu: fix reset domain xgmi hive info reference leak

Jonathan Kim jonathan.kim at amd.com
Thu Aug 11 13:42:17 UTC 2022


When an XGMI node is added to a hive, it takes an additional hive
reference on behalf of its reset domain.

That extra reference was never dropped when the device was removed
from the hive, so it leaked. Drop it in amdgpu_xgmi_remove_device().

Signed-off-by: Jonathan Kim <jonathan.kim at amd.com>
---
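Note for readers: the imbalance described above can be sketched with a
toy reference count. This is only an illustration under stated
assumptions, not the driver code: toy_hive, toy_dev, toy_hive_get(),
toy_hive_put(), toy_add_device() and toy_remove_device() are
hypothetical stand-ins for the amdgpu structures and helpers, and the
plain int counter stands in for whatever refcounting the hive really
uses.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); these are not the real amdgpu types. */
struct toy_hive {
	int refcount;      /* stands in for the hive's reference count */
	int reset_domain;  /* placeholder for the hive-owned reset domain */
};

struct toy_dev {
	struct toy_hive *hive;
	int *reset_domain; /* points at hive->reset_domain once joined */
};

static void toy_hive_get(struct toy_hive *hive)
{
	hive->refcount++;
}

static void toy_hive_put(struct toy_hive *hive)
{
	assert(hive->refcount > 0); /* catch over-release in the model */
	hive->refcount--;
}

/* Joining the hive takes two references in this model. */
static void toy_add_device(struct toy_dev *dev, struct toy_hive *hive)
{
	toy_hive_get(hive);                      /* held via dev->hive */
	dev->hive = hive;

	toy_hive_get(hive);                      /* held for the reset domain */
	dev->reset_domain = &hive->reset_domain;
}

/* Removal must drop both references, mirroring the shape of the fix. */
static void toy_remove_device(struct toy_dev *dev)
{
	struct toy_hive *hive = dev->hive;

	toy_hive_put(hive);                      /* membership reference */

	/* Only drop the second one if the device shares the hive's domain. */
	if (dev->reset_domain && dev->reset_domain == &hive->reset_domain)
		toy_hive_put(hive);

	dev->hive = NULL;
}
```

In this model, a hive starting at refcount 1 goes to 3 after
toy_add_device() and returns to 1 after toy_remove_device(). Without
the conditional put it would stay at 2, which is the kind of leak this
patch addresses.
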
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 1b108d03e785..560bf1c98f08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -731,6 +731,9 @@ int amdgpu_xgmi_remove_device(struct amdgpu_device *adev)
 	mutex_unlock(&hive->hive_lock);
 
 	amdgpu_put_xgmi_hive(hive);
+	/* device is removed from the hive so remove its reset domain reference */
+	if (adev->reset_domain && adev->reset_domain == hive->reset_domain)
+		amdgpu_put_xgmi_hive(hive);
 	adev->hive = NULL;
 
 	if (atomic_dec_return(&hive->number_devices) == 0) {
-- 
2.25.1


