Mesa (master): i965/vec4: fix register width for DF VGRF and UNIFORM

Wed May 3 14:22:13 UTC 2017

Module: Mesa
Branch: master
Commit: aaeb1c99beed39d85c300ebdb8a7bf056ee6717c
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=aaeb1c99beed39d85c300ebdb8a7bf056ee6717c

Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Tue Apr 25 12:18:17 2017 +0200

i965/vec4: fix register width for DF VGRF and UNIFORM

On gen7, the swizzles used in DF align16 instructions work on 32-bit
elements, so we can address only 2 consecutive DFs. As the rest of the code
assumes this and prepares the instructions accordingly (scalarize_df()),
we need to set the width to two again.

However, for DF align1 instructions, a width of 2 is wrong as we are not
reading the data we want. For example, a uniform would have a region of
<0, 2, 1>, so it would repeat the first 2 DFs when we wanted to access
the first 4.
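
As an illustration only (not part of the patch), the effect of the region
width can be sketched with a small model of <vstride, width, hstride>
addressing. The helper name region_offsets() is hypothetical, and the model
assumes the usual region semantics in which channel i reads element
(i / width) * vstride + (i % width) * hstride:

```cpp
#include <vector>

// Hypothetical helper, not a Mesa function: returns the element offsets
// (in units of the element size) read by a region
// <vstride, width, hstride> for the first exec_size channels.
std::vector<unsigned> region_offsets(unsigned vstride, unsigned width,
                                     unsigned hstride, unsigned exec_size)
{
    std::vector<unsigned> offsets;
    for (unsigned ch = 0; ch < exec_size; ch++) {
        const unsigned row = ch / width;  // which row of the region
        const unsigned col = ch % width;  // position inside that row
        offsets.push_back(row * vstride + col * hstride);
    }
    return offsets;
}
```

Under this model, region_offsets(0, 2, 1, 4) gives 0, 1, 0, 1 (the first
two DFs repeated), while region_offsets(0, 4, 1, 4) gives 0, 1, 2, 3 (the
first four DFs).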

This patch sets the default width to 4 and then changes the width of the
DF sources of align16 instructions to 2 when we translate the logical
swizzle to the physical one.
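
For reference, the width expression that the diff removes can be evaluated
in isolation. This is a sketch, assuming REG_SIZE is 32 bytes (one GRF);
old_width() is a hypothetical name used only here:

```cpp
// Assumption: one GRF is 32 bytes, matching REG_SIZE in the i965 compiler.
constexpr unsigned REG_SIZE = 32;

// Hypothetical helper mirroring the removed expression
// REG_SIZE / 2 / MAX2(4, type_size).
constexpr unsigned old_width(unsigned type_size)
{
    return REG_SIZE / 2 / (type_size > 4 ? type_size : 4);
}

static_assert(old_width(4) == 4, "32-bit types used a width of 4");
static_assert(old_width(8) == 2, "64-bit (DF) types used a width of 2");
```

The old code therefore already picked 4 for 32-bit types and 2 for DF; the
patch keeps 4 everywhere by default and applies 2 only to align16 DF
sources.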

v2:
- Remove conditional (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Cc: "17.1" <mesa-stable at lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez at riseup.net>

---

 src/intel/compiler/brw_vec4.cpp | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index e27be8fc25..70487d3c15 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -1975,9 +1975,7 @@ vec4_visitor::convert_to_hw_regs()
          struct brw_reg reg;
          switch (src.file) {
          case VGRF: {
-            const unsigned type_size = type_sz(src.type);
-            const unsigned width = REG_SIZE / 2 / MAX2(4, type_size);
-            reg = byte_offset(brw_vecn_grf(width, src.nr, 0), src.offset);
+            reg = byte_offset(brw_vecn_grf(4, src.nr, 0), src.offset);
             reg.type = src.type;
             reg.abs = src.abs;
             reg.negate = src.negate;
@@ -1985,12 +1983,11 @@ vec4_visitor::convert_to_hw_regs()
          }
 
          case UNIFORM: {
-            const unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type));
             reg = stride(byte_offset(brw_vec4_grf(
                                         prog_data->base.dispatch_grf_start_reg +
                                         src.nr / 2, src.nr % 2 * 4),
                                      src.offset),
-                         0, width, 1);
+                         0, 4, 1);
             reg.type = src.type;
             reg.abs = src.abs;
             reg.negate = src.negate;
@@ -2527,6 +2524,11 @@ vec4_visitor::apply_logical_swizzle(struct brw_reg *hw_reg,
    assert(brw_is_single_value_swizzle(reg.swizzle) ||
           is_supported_64bit_region(inst, arg));
 
+   /* Apply the region <2, 2, 1> for GRF or <0, 2, 1> for uniforms, as align16
+    * HW can only do 32-bit swizzle channels.
+    */
+   hw_reg->width = BRW_WIDTH_2;
+
    if (is_supported_64bit_region(inst, arg) &&
        !is_gen7_supported_64bit_swizzle(inst, arg)) {
       /* Supported 64-bit swizzles are those such that their first two