[Mesa-dev] [PATCH v2 07/20] i965/fs: fix lower SIMD width for IVB/BYT's MOV_INDIRECT

Samuel Iglesias Gonsálvez siglesias at igalia.com
Tue Jan 17 09:49:21 UTC 2017


From: "Juan A. Suarez Romero" <jasuarez at igalia.com>

Prior to Broadwell, only 8 address subregisters are available for
MOV_INDIRECT.

According to the IVB and HSW PRMs:

"2. When the destination requires two registers and the sources are
 indirect, the sources must use 1x1 regioning mode. In addition, the
 sources must be assembled from GRF registers each accessed by adjacent
 index registers in 1x1 regioning modes."

So for DF instructions the execution size is not limited by the number
of address registers that are available, but by the EU decompression
logic not handling VxH indirect addressing correctly.

This patch caps the destination of MOV_INDIRECT at a single register on
pre-Broadwell hardware, which limits the SIMD width to 4 for DF
instructions.

v2:
- Fix typo (Matt).
- Fix condition (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez at igalia.com>
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
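
Note (not part of the commit message): the width arithmetic in this patch
can be checked in isolation with a minimal sketch like the one below. The
helper name mov_indirect_lowered_width and its flattened parameters are
hypothetical; they stand in for devinfo->gen, inst->dst.stride,
type_sz(inst->dst.type) and inst->exec_size. It assumes REG_SIZE is 32 bytes
and a non-zero destination stride.

```cpp
#include <algorithm>

// Assumption: one GRF register is 32 bytes, matching REG_SIZE in i965.
static const unsigned REG_SIZE = 32;

// Hypothetical standalone version of the SHADER_OPCODE_MOV_INDIRECT case
// in get_lowered_simd_width(), with the instruction fields passed as
// plain integers. dst_stride is assumed to be at least 1.
inline unsigned
mov_indirect_lowered_width(int gen, unsigned dst_stride,
                           unsigned type_size, unsigned exec_size)
{
   // Gen8+ may write two destination registers; earlier gens only one.
   const unsigned max_size = (gen >= 8 ? 2u : 1u) * REG_SIZE;

   // Limit imposed by the number of available address subregisters.
   const unsigned addr_limit = gen >= 8 ? 16u : 8u;

   // Number of channels that fit into the allowed destination size.
   const unsigned dst_limit = max_size / (dst_stride * type_size);

   return std::min(addr_limit, std::min(dst_limit, exec_size));
}
```

With these assumptions, a SIMD8 DF move (type size 8, stride 1) on gen7
yields 32 / 8 = 4, while the previous 2 * REG_SIZE bound would have
allowed 8. A 32-bit move on gen7 is still limited to 8 by the address
subregister count.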

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a2ba0fde9fd..c9b6c995dc9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4870,11 +4870,16 @@ get_lowered_simd_width(const struct gen_device_info *devinfo,
    case SHADER_OPCODE_URB_WRITE_SIMD8_MASKED_PER_SLOT:
       return MIN2(8, inst->exec_size);
 
-   case SHADER_OPCODE_MOV_INDIRECT:
-      /* Prior to Broadwell, we only have 8 address subregisters */
+   case SHADER_OPCODE_MOV_INDIRECT: {
+      const unsigned max_size = (devinfo->gen >= 8 ? 2 : 1) * REG_SIZE;
+      /* Prior to Broadwell, we only have 8 address subregisters. In case of
+       * DF instructions in HSW/IVB, the exec_size is limited by the EU
+       * decompression logic not handling VxH indirect addressing correctly.
+       */
       return MIN3(devinfo->gen >= 8 ? 16 : 8,
-                  2 * REG_SIZE / (inst->dst.stride * type_sz(inst->dst.type)),
+                  max_size / (inst->dst.stride * type_sz(inst->dst.type)),
                   inst->exec_size);
+   }
 
    case SHADER_OPCODE_LOAD_PAYLOAD: {
       const unsigned reg_count =
-- 
2.11.0
