[Mesa-dev] [PATCH v2] i965/vec4: Don't coalesce regs in Gen6 MATH ops if reswizzle/writemask needed
Antia Puentes
apuentes at igalia.com
Tue Sep 22 09:53:36 PDT 2015
Gen6 MATH instructions can not execute in align16 mode, so swizzles or
writemasking are not allowed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033
---
I have tried to find an example where the writemask check is strictly needed
in order to avoid an incorrect register coalescing, but I have failed.
If I have something like this in my shader:
in vec4 attr_pos;
void main() {
vec4 a;
a.xy = sin(attr_pos);
..
}
It is translated into the following vec4 instructions:
sin vgrf4.0:F, vgrf3.xyzw:F
mov vgrf0.0.xy:F, vgrf4.xyzw:F <-- (added by emit_math to fix the writemasking issue)
And converted by the opt_reduce_swizzle optimization into:
sin vgrf4.0:F, vgrf3.xyzw:F
mov vgrf0.0.xy:F, vgrf4.xyyy:F
The swizzle in the MOV instruction is not the identity anymore, so the method "can_reswizzle"
in opt_register_coalesce will return FALSE because of the swizzle-check that my first
version of the patch added.
Out of curiosity, I disabled the opt_reduce_swizzle optimization to see what happens.
Still "can_reswizzle" returns FALSE preventing the incorrect register coalescing,
because the math function sets more things than the MOV instruction; i.e.
the check "(dst.writemask & ~swizzle_mask) is positive.
Although I have not found an example where the writemask-check is crucial, that does not neccessarily
mean that it does not exists. I am sending the second version of the patch in case you prefer to have
that check in place.
src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++-
src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +++++++++--
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index 966a410..a48bb68 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -175,7 +175,8 @@ public:
bool is_send_from_grf();
unsigned regs_read(unsigned arg) const;
- bool can_reswizzle(int dst_writemask, int swizzle, int swizzle_mask);
+ bool can_reswizzle(const struct brw_device_info *devinfo, int dst_writemask,
+ int swizzle, int swizzle_mask);
void reswizzle(int dst_writemask, int swizzle);
bool can_do_source_mods(const struct brw_device_info *devinfo);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index ed49cd3..ae695f1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -941,10 +941,17 @@ vec4_visitor::opt_set_dependency_control()
}
bool
-vec4_instruction::can_reswizzle(int dst_writemask,
+vec4_instruction::can_reswizzle(const struct brw_device_info *devinfo,
+ int dst_writemask,
int swizzle,
int swizzle_mask)
{
+ /* Gen6 MATH instructions can not execute in align16 mode, so swizzles
+ * or writemasking are not allowed. */
+ if (devinfo->gen == 6 && is_math() &&
+ (swizzle != BRW_SWIZZLE_XYZW || dst_writemask != WRITEMASK_XYZW))
+ return false;
+
/* If this instruction sets anything not referenced by swizzle, then we'd
* totally break it when we reswizzle.
*/
@@ -1077,7 +1084,7 @@ vec4_visitor::opt_register_coalesce()
break;
/* If we can't handle the swizzle, bail. */
- if (!scan_inst->can_reswizzle(inst->dst.writemask,
+ if (!scan_inst->can_reswizzle(devinfo, inst->dst.writemask,
inst->src[0].swizzle,
chans_needed)) {
break;
--
2.1.0
More information about the mesa-dev
mailing list