[Mesa-dev] [PATCH 73/95] i965/vec4: set force_vstride0 on any 64-bit source that has subnr > 0

Iago Toral Quiroga itoral at igalia.com
Tue Jul 19 10:41:10 UTC 2016


From: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

Sometimes we emit code that has subnr > 0 to select the second half
of a DF register (components Z or W). For example, the 64-bit
shuffling code does this. For that code to work properly we need to
make sure that we use vstride=0 on these source registers too
(that is, the force_vstride0 flag must be set on the source).

Instead of always having to remember that we need to force the vstride
to 0 in these cases, it is better to do it here, together with the
other cases where we need to set this flag. This way there is only
one place in the driver where we handle this.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral at igalia.com>
---
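
Note: the rule this patch centralizes can be sketched in isolation as
below. This is only an illustrative model, not driver code: the struct
SrcRegSketch and the helper apply_subnr_rule() are hypothetical names
introduced here, only the two fields the patch touches are modeled, and
the sketch assumes the caller has already restricted itself to 64-bit
sources.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a source register.
// Only the two fields touched by the patch are modeled.
struct SrcRegSketch {
    unsigned subnr = 0;           // sub-register offset; > 0 selects the second half
    bool force_vstride0 = false;  // request a vertical stride of 0
};

// Mirrors the check added by the patch: a source with subnr > 0
// gets force_vstride0 set. A flag that is already set is left alone.
inline void apply_subnr_rule(SrcRegSketch &src) {
    if (src.subnr)
        src.force_vstride0 = true;
}
```

With this shape, callers that set subnr no longer need to set the flag
themselves, which is the point of the removal in the first hunk.
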
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 1332d96..9672b2c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -2298,7 +2298,6 @@ vec4_visitor::expand_64bit_swizzle_to_32bit()
                /* Subnr must be in units of bytes for FIXED_GRF */
                if (inst->src[arg].file == FIXED_GRF)
                   inst->src[arg].subnr *= type_sz(inst->src[arg].type);
-               inst->src[arg].force_vstride0 = true;
             } else {
                inst->src[arg].reg_offset += 1;
             }
@@ -2311,6 +2310,18 @@ vec4_visitor::expand_64bit_swizzle_to_32bit()
                inst->src[arg].force_vstride0 = true;
             }
          }
+
+         /* Any DF source with a subnr > 0 is intended to address the second
+          * half of a register and needs a vertical stride of 0 so we:
+          *
+          * 1. Don't violate register region restrictions, when execsize > 2
+          *    (we only use exec sizes of 4 and 8, so always)
+          * 2. Activate the gen7 instruction decompression bug exploit, when
+          *    execsize == 8.
+          */
+         if (inst->src[arg].subnr)
+            inst->src[arg].force_vstride0 = true;
+
          inst->src[arg].swizzle = BRW_SWIZZLE4(swizzle * 2, swizzle * 2 + 1,
                                                swizzle * 2, swizzle * 2 + 1);
          progress = true;
-- 
2.7.4


