[Mesa-dev] [PATCH] i965/vec4: remove the generator hack for dual instanced GS

Iago Toral Quiroga itoral at igalia.com
Fri Aug 26 09:54:54 UTC 2016


This hack was introduced in commit 03ac2c7223f7645e3:
i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs

It was added specifically to fix up the code we emitted to deal with
gl_PointSize inputs in dual instanced mode, where we were emitting a MOV
to copy the point size from .w (where the hardware delivers it) to .x
(where the code expects to find this float). This meant that we were
emitting a MOV to an ATTR destination that could have a width of 4 (in
dual instanced mode), so it was necessary to fix the execution size and
regioning of the instruction.

Fortunately, Ken fixed this in 67c5d00273ca2:
i965/vec4/gs: Stop munging the ATTR containing gl_PointSize.

by using a WWWW swizzle instead of a MOV. As the commit log of that
patch states, we no longer emit instructions with ATTR destinations,
which makes the fixup code in the generator unnecessary.

---

We tested this on the following platforms without regressions in piglit:

- SandyBridge (full piglit run with our fp64 branch)
- IvyBridge (full piglit run with master)
- Haswell (shader.py with master)
- Haswell (shader.py and forcing dual-instanced mode for GS with master)

It would still probably be a good idea to run it through Jenkins just in case.
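
For reviewers who want a reminder of why the swizzle makes the MOV (and
therefore the generator fixup) unnecessary, here is a rough toy model of the
two approaches. It is only a sketch: Vec4, Swizzle, read_swizzled and the
point_size_* helpers are hypothetical names made up for this note, not Mesa
types or functions, and it models a single vertex rather than real register
regions.

```cpp
#include <array>

// Toy model only: none of these names exist in Mesa.
enum Chan { X = 0, Y = 1, Z = 2, W = 3 };

using Vec4 = std::array<float, 4>;

// A swizzle selects, for each channel read, which source channel supplies it.
struct Swizzle {
   Chan c[4];
};

constexpr Swizzle SWZ_XYZW = {{X, Y, Z, W}};
constexpr Swizzle SWZ_WWWW = {{W, W, W, W}};

// Read a register through a swizzle; the register itself is not modified.
Vec4 read_swizzled(const Vec4 &reg, const Swizzle &s)
{
   return {reg[s.c[0]], reg[s.c[1]], reg[s.c[2]], reg[s.c[3]]};
}

// Old approach: write to the attribute (the "MOV attr.x, attr.w" step),
// then read it with the identity swizzle. The write is what needed an
// ATTR destination.
float point_size_via_mov(Vec4 attr)
{
   attr[X] = attr[W];
   return read_swizzled(attr, SWZ_XYZW)[X];
}

// New approach: leave the attribute untouched and read it with WWWW, so
// every channel (including .x) sees the value delivered in .w.
float point_size_via_swizzle(const Vec4 &attr)
{
   return read_swizzled(attr, SWZ_WWWW)[X];
}
```

Both helpers return the value stored in .w, but only the first one writes
to the attribute. With no write there is no instruction with an ATTR
destination, which is the case the removed code was handling.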

---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 31 ------------------------
 1 file changed, 31 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 7ad4f86..128eb2d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1499,34 +1499,6 @@ generate_code(struct brw_codegen *p,
       assert(inst->mlen <= BRW_MAX_MSG_LENGTH);
 
       unsigned pre_emit_nr_insn = p->nr_insn;
-      bool fix_exec_size = false;
-
-      if (dst.width == BRW_WIDTH_4) {
-         /* This happens in attribute fixups for "dual instanced" geometry
-          * shaders, since they use attributes that are vec4's.  Since the exec
-          * width is only 4, it's essential that the caller set
-          * force_writemask_all in order to make sure the instruction is executed
-          * regardless of which channels are enabled.
-          */
-         assert(inst->force_writemask_all);
-
-         /* Fix up any <8;8,1> or <0;4,1> source registers to <4;4,1> to satisfy
-          * the following register region restrictions (from Graphics BSpec:
-          * 3D-Media-GPGPU Engine > EU Overview > Registers and Register Regions
-          * > Register Region Restrictions)
-          *
-          *     1. ExecSize must be greater than or equal to Width.
-          *
-          *     2. If ExecSize = Width and HorzStride != 0, VertStride must be set
-          *        to Width * HorzStride."
-          */
-         for (int i = 0; i < 3; i++) {
-            if (src[i].file == BRW_GENERAL_REGISTER_FILE)
-               src[i] = stride(src[i], 4, 4, 1);
-         }
-         brw_set_default_exec_size(p, BRW_EXECUTE_4);
-         fix_exec_size = true;
-      }
 
       switch (inst->opcode) {
       case VEC4_OPCODE_UNPACK_UNIFORM:
@@ -2028,9 +2000,6 @@ generate_code(struct brw_codegen *p,
          unreachable("Unsupported opcode");
       }
 
-      if (fix_exec_size)
-         brw_set_default_exec_size(p, BRW_EXECUTE_8);
-
       if (inst->opcode == VEC4_OPCODE_PACK_BYTES) {
          /* Handled dependency hints in the generator. */
 
-- 
2.7.4


