[Mesa-dev] [PATCH 4/8] i965: Account for poor address calculations in Haswell CS scratch size.

Kenneth Graunke kenneth at whitecape.org
Fri Jun 10 20:05:16 UTC 2016


Curro figured this out by investigating the simulator.  Apparently
there's also a workaround in the Windows driver.  I'm not sure it's
actually documented anywhere.

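We were underallocating the scratch buffer by a factor of 128/70: the
hardware addresses 16 * 8 = 128 thread slots per subslice, while we
sized the buffer for only 10 * 7 = 70.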

Cc: "12.0" <mesa-stable at lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
---
 src/mesa/drivers/dri/i965/brw_cs.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)
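
For reviewers who want to check the arithmetic, here is a minimal
standalone sketch of the sizing described in the WaCSScratchSize
comment.  The helper names are hypothetical and do not exist in Mesa;
the field widths (4 bits for the EU, 3 bits for the thread) are taken
from that comment, and the sketch assumes the hardware indexes scratch
space by the raw field values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): number of per-subslice thread
 * slots implied by the widths of the EU and thread fields in the Thread ID.
 * Every encodable value needs a slot, even if no such EU or thread exists.
 */
static unsigned
sparse_threads_per_subslice(unsigned eu_bits, unsigned thread_bits)
{
   assert(eu_bits + thread_bits < 32);
   return (1u << eu_bits) * (1u << thread_bits);
}

/* Hypothetical helper: total scratch bytes for a given per-thread scratch
 * size, thread slots per subslice, and subslice count.
 */
static uint64_t
cs_scratch_bytes(unsigned per_thread_scratch,
                 unsigned threads_per_subslice,
                 unsigned subslices)
{
   return (uint64_t)per_thread_scratch * threads_per_subslice * subslices;
}
```

As an illustration only (1 KiB of scratch per thread and 2 subslices are
made-up inputs): the packed count of 10 * 7 = 70 threads gives 143360
bytes, while sparse_threads_per_subslice(4, 3) = 128 gives 262144 bytes,
which is the 128/70 ratio mentioned above.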

diff --git a/src/mesa/drivers/dri/i965/brw_cs.c b/src/mesa/drivers/dri/i965/brw_cs.c
index c8598d6..329adff 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.c
+++ b/src/mesa/drivers/dri/i965/brw_cs.c
@@ -150,9 +150,28 @@ brw_codegen_cs_prog(struct brw_context *brw,
 
    if (prog_data.base.total_scratch) {
       const unsigned subslices = MAX2(brw->intelScreen->subslice_total, 1);
+
+      /* WaCSScratchSize:hsw
+       *
+       * Haswell's scratch space address calculation appears to be sparse
+       * rather than tightly packed.  The Thread ID has bits indicating
+       * which subslice, EU within a subslice, and thread within an EU
+       * it is.  There's a maximum of two slices and two subslices, so these
+       * can each be stored in a single bit.  Even though there are only 10 EUs
+       * per subslice, this is stored in 4 bits, so there's an effective
+       * maximum value of 16 EUs.  Similarly, although there are only 7
+       * threads per EU, this is stored in a 3 bit number, giving an effective
+       * maximum value of 8 threads per EU.
+       *
+       * This means that we need to use 16 * 8 instead of 10 * 7 for the
+       * number of threads per subslice.
+       */
+      const unsigned threads_per_subslice =
+         brw->is_haswell ? 16 * 8 : brw->max_cs_threads;
+
       brw_get_scratch_bo(brw, &brw->cs.base.scratch_bo,
                          prog_data.base.total_scratch *
-                         brw->max_cs_threads * subslices);
+                         threads_per_subslice * subslices);
    }
 
    if (unlikely(INTEL_DEBUG & DEBUG_CS))
-- 
2.8.3


