[Mesa-stable] [PATCH 2/5] intel/vulkan: Hard code scratch_ids_per_subslice for Cherryview
Jordan Justen
jordan.l.justen at intel.com
Wed Mar 7 08:16:27 UTC 2018
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.
Cc: Kenneth Graunke <kenneth at whitecape.org>
Cc: <mesa-stable at lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
---
src/intel/vulkan/anv_allocator.c | 45 +++++++++++++++++++++++++---------------
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index fe14d6cfabd..423e863a9e0 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -1097,24 +1097,35 @@ anv_scratch_pool_alloc(struct anv_device *device, struct anv_scratch_pool *pool,
&device->instance->physicalDevice;
const struct gen_device_info *devinfo = &physical_device->info;
- /* WaCSScratchSize:hsw
- *
- * Haswell's scratch space address calculation appears to be sparse
- * rather than tightly packed. The Thread ID has bits indicating which
- * subslice, EU within a subslice, and thread within an EU it is.
- * There's a maximum of two slices and two subslices, so these can be
- * stored with a single bit. Even though there are only 10 EUs per
- * subslice, this is stored in 4 bits, so there's an effective maximum
- * value of 16 EUs. Similarly, although there are only 7 threads per EU,
- * this is stored in a 3 bit number, giving an effective maximum value
- * of 8 threads per EU.
- *
- * This means that we need to use 16 * 8 instead of 10 * 7 for the
- * number of threads per subslice.
- */
const unsigned subslices = MAX2(physical_device->subslice_total, 1);
- const unsigned scratch_ids_per_subslice =
- device->info.is_haswell ? 16 * 8 : devinfo->max_cs_threads;
+
+ unsigned scratch_ids_per_subslice;
+ if (devinfo->is_haswell) {
+ /* WaCSScratchSize:hsw
+ *
+ * Haswell's scratch space address calculation appears to be sparse
+ * rather than tightly packed. The Thread ID has bits indicating
+ * which subslice, EU within a subslice, and thread within an EU it
+ * is. There's a maximum of two slices and two subslices, so these
+ * can be stored with a single bit. Even though there are only 10 EUs
+ * per subslice, this is stored in 4 bits, so there's an effective
+ * maximum value of 16 EUs. Similarly, although there are only 7
+ * threads per EU, this is stored in a 3 bit number, giving an
+ * effective maximum value of 8 threads per EU.
+ *
+ * This means that we need to use 16 * 8 instead of 10 * 7 for the
+ * number of threads per subslice.
+ */
+ scratch_ids_per_subslice = 16 * 8;
+ } else if (devinfo->is_cherryview) {
+ /* For Cherryview, it appears that the scratch addresses for the 6 EU
+ * devices may still generate compute scratch addresses covering the
+ * same range as 8 EU.
+ */
+ scratch_ids_per_subslice = 8 * 7;
+ } else {
+ scratch_ids_per_subslice = devinfo->max_cs_threads;
+ }
uint32_t max_threads[] = {
[MESA_SHADER_VERTEX] = devinfo->max_vs_threads,
--
2.16.1
More information about the mesa-stable
mailing list