[Intel-xe] [PATCH] drm/xe/bo: handle PL_TT -> PL_TT

Matthew Auld matthew.auld at intel.com
Thu Jun 15 18:18:40 UTC 2023


When moving between PL_VRAM <-> PL_SYSTEM we have to use PL_TT in the
middle as a temporary resource for the actual copy. In some GL
workloads it can be seen that once the resource has been moved to
PL_TT we might have to bail out of ttm_bo_validate() before
finishing the final hop. If this happens the resource is left as
TTM_PL_FLAG_TEMPORARY, and when ttm_bo_validate() is restarted the
current placement is always seen as incompatible, requiring us to
complete the move. However, if the BO allows PL_TT as a possible
placement we can end up attempting a PL_TT -> PL_TT move (for example
when running out of VRAM), which leads to explosions in xe_bo_move(),
like triggering the XE_BUG_ON(!tile).

Going from TTM_PL_FLAG_TEMPORARY with PL_TT -> PL_VRAM should already
work as-is, so it looks like we only need to worry about PL_TT -> PL_TT,
which we can just treat as a dummy move since no real move is needed.

Reported-by: José Roberto de Souza <jose.souza at intel.com>
Signed-off-by: Matthew Auld <matthew.auld at intel.com>
Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
---
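
Not part of the commit: a minimal standalone sketch of the decision the
check below encodes, for readers unfamiliar with the multi-hop flow. The
names here (pl_type, move_kind, struct resource, classify_move) are
hypothetical stand-ins and not the real TTM or xe definitions, and
everything that is neither a dummy move nor a VRAM <-> SYSTEM move is
lumped together as a single-step move purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical model types; NOT the real TTM/xe definitions. */
enum pl_type { PL_SYSTEM, PL_TT, PL_VRAM };

enum move_kind {
	MOVE_NULL,     /* dummy move: nothing to copy */
	MOVE_MULTIHOP, /* needs PL_TT as a temporary bounce resource */
	MOVE_DIRECT,   /* single-step move (simplification of this model) */
};

struct resource {
	enum pl_type mem_type;
	bool temporary; /* models TTM_PL_FLAG_TEMPORARY left by an interrupted hop */
};

static enum move_kind classify_move(const struct resource *old_mem,
				    const struct resource *new_mem)
{
	/*
	 * An interrupted multi-hop can leave old_mem in PL_TT. If the
	 * restarted validation then picks PL_TT as the destination,
	 * there is nothing to copy, so treat it as a dummy move.
	 */
	if (old_mem->mem_type == PL_TT && new_mem->mem_type == PL_TT)
		return MOVE_NULL;

	/* PL_VRAM <-> PL_SYSTEM has to bounce through PL_TT. */
	if ((old_mem->mem_type == PL_VRAM && new_mem->mem_type == PL_SYSTEM) ||
	    (old_mem->mem_type == PL_SYSTEM && new_mem->mem_type == PL_VRAM))
		return MOVE_MULTIHOP;

	return MOVE_DIRECT;
}
```

In this model a temporary PL_TT resource retried towards PL_VRAM is still
classified as a single-step move, matching the expectation that this
case already works, while PL_TT -> PL_TT is the only case short-circuited.
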
 drivers/gpu/drm/xe/xe_bo.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index b94a80a32d86..5aed626cce80 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -603,6 +603,16 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 		goto out;
 	}
 
+	/*
+	 * Failed multi-hop where the old_mem is still marked as
+	 * TTM_PL_FLAG_TEMPORARY, should just be a dummy move.
+	 */
+	if (old_mem->mem_type == XE_PL_TT &&
+	    new_mem->mem_type == XE_PL_TT) {
+		ttm_bo_move_null(ttm_bo, new_mem);
+		goto out;
+	}
+
 	if (!move_lacks_source && !xe_bo_is_pinned(bo)) {
 		ret = xe_bo_move_notify(bo, ctx);
 		if (ret)
-- 
2.40.1


