[igt-dev] [PATCH i-g-t 2/2] lib: Apply rightmost execution mask to xehp gpu walker

Christoph Manszewski christoph.manszewski at intel.com
Fri Jun 9 09:37:19 UTC 2023


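When x + width is not a multiple of 16, the final thread group in each
row must execute only part of the SIMD16 so that it stays within the
region of interest. Derive the execution mask from (x + width) & 15
instead of always enabling every lane.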

Signed-off-by: Chris Wilson <chris.p.wilson at intel.com>
Signed-off-by: Christoph Manszewski <christoph.manszewski at intel.com>
---
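For reviewers, the mask arithmetic in this patch can be checked in
isolation. The sketch below is illustrative only: rightmost_exec_mask()
and groups_per_row() are hypothetical helpers, not functions in
lib/gpu_cmds.c, and it assumes one SIMD lane per pixel along x, as the
x_dim computation suggests.

```c
#include <assert.h>
#include <stdint.h>

#define SIMD_WIDTH 16u /* lanes per thread in the SIMD16 dispatch */

/*
 * Hypothetical helper (not part of lib/gpu_cmds.c): lanes to enable in
 * the rightmost thread group of a row whose right edge is at x + width.
 */
static uint32_t rightmost_exec_mask(uint32_t x, uint32_t width)
{
	uint32_t lanes = (x + width) & (SIMD_WIDTH - 1);

	/* A right edge on a 16-pixel boundary needs every lane. */
	if (lanes == 0)
		lanes = SIMD_WIDTH;

	uint32_t mask = (UINT32_C(1) << lanes) - 1;
	assert(mask != 0); /* at least one lane is always enabled */
	return mask;
}

/* Thread groups per row, mirroring the x_dim computation in the patch. */
static uint32_t groups_per_row(uint32_t x, uint32_t width)
{
	return (x + width + SIMD_WIDTH - 1) / SIMD_WIDTH;
}
```

With x = 0 and width = 20, for instance, a row takes two thread groups
and only the low four lanes of the last one are enabled, so the mask is
0xf rather than all lanes.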
 lib/gpu_cmds.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/gpu_cmds.c b/lib/gpu_cmds.c
index aecba928..48fe1e13 100644
--- a/lib/gpu_cmds.c
+++ b/lib/gpu_cmds.c
@@ -953,7 +953,7 @@ xehp_emit_compute_walk(struct intel_bb *ibb,
 		       struct xehp_interface_descriptor_data *pidd,
 		       uint8_t color)
 {
-	uint32_t x_dim, y_dim;
+	uint32_t x_dim, y_dim, mask;
 
 	/*
 	 * Simply do SIMD16 based dispatch, so every thread uses
@@ -969,6 +969,12 @@ xehp_emit_compute_walk(struct intel_bb *ibb,
 	x_dim = (x + width + 15) / 16;
 	y_dim = y + height;
 
+	mask = (x + width) & 15;
+	if (mask == 0)
+		mask = (1 << 16) - 1;
+	else
+		mask = (1 << mask) - 1;
+
 	intel_bb_out(ibb, XEHP_COMPUTE_WALKER | 0x25);
 
 	intel_bb_out(ibb, 0); /* debug object */		//dw1
@@ -980,7 +986,7 @@ xehp_emit_compute_walk(struct intel_bb *ibb,
 	intel_bb_out(ibb, 1 << 30 | 1 << 25 | 1 << 17);		//dw4
 
 	/* Execution mask */
-	intel_bb_out(ibb, 0xffffffff);				//dw5
+	intel_bb_out(ibb, mask);				//dw5
 
 	/* x/y/z max */
 	intel_bb_out(ibb, (x_dim << 20) | (y_dim << 10) | 1);	//dw6
-- 
2.40.1


