[RFC 3/3] drm/amdgpu: Add locking to amdgpu_ctx_mgr_entity_fini()
Tvrtko Ursulin
tvrtko.ursulin at igalia.com
Mon May 19 16:37:13 UTC 2025
amdgpu_ctx_mgr_entity_fini() walks the context IDR unlocked, so the question
is: could it, in theory, see a stale entry and attempt to destroy the
drm_sched_entity twice?
The problem is that I have hit this on a KASAN-enabled kernel only _once_ and
never since. In that case KASAN reported that amdgpu_ctx_ioctl() had already
freed the entity, which would mean the question is whether one CPU could go
through the mutex_unlock() and another CPU follow immediately with
file->release (DRM postclose) and still see the stale entry.
Throwing it out there so as not to forget about it, since I have already
managed to lose the KASAN trace.
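For illustration only, a minimal userspace sketch of the pattern the patch aims for: the teardown walk takes the same lock as the free path, so it can never observe an entry that the free path is in the middle of destroying. Everything here is a hypothetical stand-in and not driver code: struct ctx, struct ctx_mgr, the fixed-size table (in place of the IDR), the pthread mutex (in place of mgr->lock) and all the helper names are assumptions made for the example. It also frees the context in the walk for brevity, which the real function does not do.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_CTX 4

/* Hypothetical stand-in for a context owning one separately allocated entity. */
struct ctx {
	int *entity;
};

/* Hypothetical stand-in for the context manager: a table instead of an IDR. */
struct ctx_mgr {
	pthread_mutex_t lock;
	struct ctx *handles[MAX_CTX];
	int entities_finished;	/* teardown counter, only used for checking */
};

static void ctx_mgr_init(struct ctx_mgr *mgr)
{
	pthread_mutex_init(&mgr->lock, NULL);
	for (int i = 0; i < MAX_CTX; i++)
		mgr->handles[i] = NULL;
	mgr->entities_finished = 0;
}

/* Publish a new context in the first free slot; returns its handle or -1. */
static int ctx_alloc(struct ctx_mgr *mgr)
{
	int id = -1;

	pthread_mutex_lock(&mgr->lock);
	for (int i = 0; i < MAX_CTX && id < 0; i++) {
		struct ctx *c;

		if (mgr->handles[i])
			continue;
		c = malloc(sizeof(*c));
		if (!c)
			break;
		c->entity = malloc(sizeof(*c->entity));
		if (!c->entity) {
			free(c);
			break;
		}
		mgr->handles[i] = c;
		id = i;
	}
	pthread_mutex_unlock(&mgr->lock);
	return id;
}

/* Caller must hold mgr->lock. A second teardown would trip the assert. */
static void entity_fini_locked(struct ctx_mgr *mgr, struct ctx *c)
{
	assert(c->entity);
	free(c->entity);
	c->entity = NULL;
	mgr->entities_finished++;
}

/* ioctl-like free path: unpublish and destroy under the lock. */
static int ctx_free(struct ctx_mgr *mgr, int id)
{
	int ret = -1;

	if (id < 0 || id >= MAX_CTX)
		return ret;
	pthread_mutex_lock(&mgr->lock);
	if (mgr->handles[id]) {
		entity_fini_locked(mgr, mgr->handles[id]);
		free(mgr->handles[id]);
		mgr->handles[id] = NULL;
		ret = 0;
	}
	pthread_mutex_unlock(&mgr->lock);
	return ret;
}

/* Teardown walk: holds the same lock, so it only sees live entries. */
static void ctx_mgr_entity_fini(struct ctx_mgr *mgr)
{
	pthread_mutex_lock(&mgr->lock);
	for (int i = 0; i < MAX_CTX; i++) {
		if (!mgr->handles[i])
			continue;
		entity_fini_locked(mgr, mgr->handles[i]);
		free(mgr->handles[i]);
		mgr->handles[i] = NULL;
	}
	pthread_mutex_unlock(&mgr->lock);
}
```

With the lock taken in both paths, a handle freed through ctx_free() is already gone from the table by the time ctx_mgr_entity_fini() can start its walk, so each entity is torn down exactly once. Whether the unlocked walk in the driver can really race like this is exactly the open question above; the sketch only shows the intended invariant.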
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
Cc: Alex Deucher <alexander.deucher at amd.com>
Cc: Christian König <christian.koenig at amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 85567d0d9545..95b005ed839e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -927,6 +927,7 @@ static void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr *mgr)
 	idp = &mgr->ctx_handles;
+	mutex_lock(&mgr->lock);
 	idr_for_each_entry(idp, ctx, id) {
 		if (kref_read(&ctx->refcount) != 1) {
 			DRM_ERROR("ctx %p is still alive\n", ctx);
@@ -945,6 +946,7 @@ static void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr *mgr)
 			}
 		}
 	}
+	mutex_unlock(&mgr->lock);
 }
 void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr)
--
2.48.0