[PATCH] drm/ttm: Drain workqueue before sys manager release
Karolina Stolarek
karolina.stolarek at intel.com
Fri Oct 13 14:34:23 UTC 2023
In rare cases, a delayed destruction of a BO with a system resource
could stay in the workqueue until drain_workqueue() is called
in ttm_device_fini(). An attempt to free a resource from an already
released manager results in NULL pointer dereference. Move the step
of draining and destroying the workqueue so it happens before
the ttm_sys_manager cleanup.
Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")
Signed-off-by: Karolina Stolarek <karolina.stolarek at intel.com>
---
Some background: I stumbled upon this issue when testing
ttm_bo_pipeline_gutting() with BO with an active dma_resv fence. In ~2% of
the runs, the delayed destruction of the ghost wouldn't happen until the
drain_queue() step. man->func->free(man, *res) got called via
ttm_bo_cleanup_memtype_use(), the manager and its functions were nowhere to
be seen, resulting in a nulptr deref.
drivers/gpu/drm/ttm/ttm_device.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 7726a72befc5..753126581620 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -232,6 +232,9 @@ void ttm_device_fini(struct ttm_device *bdev)
struct ttm_resource_manager *man;
unsigned i;
+ drain_workqueue(bdev->wq);
+ destroy_workqueue(bdev->wq);
+
man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
ttm_resource_manager_set_used(man, false);
ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
@@ -240,9 +243,6 @@ void ttm_device_fini(struct ttm_device *bdev)
list_del(&bdev->device_list);
mutex_unlock(&ttm_global_mutex);
- drain_workqueue(bdev->wq);
- destroy_workqueue(bdev->wq);
-
spin_lock(&bdev->lru_lock);
for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
if (list_empty(&man->lru[0]))
--
2.25.1
More information about the dri-devel
mailing list