[Mesa-dev] [PATCH] i965/fs: Update discard jump to preserve uniform loads via sampler.

Cody Northrop cody at lunarg.com
Tue Jul 1 11:43:54 PDT 2014

Commit 17c7ead7 exposed a bug in how uniform loading happens in the
presence of discard.  It manifested itself in an application as
randomly incorrect pixels on the borders of conditional areas.

This is due to how discards jump to the end of the shader incorrectly
for some channels.  The current implementation checks each 2x2
subspan to preserve derivatives.  When uniform loading via samplers
was turned on, it uses a full execution mask, as stated in
lower_uniform_pull_constant_loads(), and only populates four channels
of the destination (see generate_uniform_pull_constant_load_gen7()).
It happens incorrectly when the first subspan has been jumped over.

The series that implemented this optimization was done before the
changes to use samplers for uniform loads.  Uniform sampler loads
use special execution masks and only populate four channels, so we
can't jump over those or corruption ensues.

This fix only jumps to the end of the shader if all relevant channels
are disabled, i.e. all 8 or 16, depending on dispatch.  This
preserves the original GLbenchmark 2.7 speedup noted in commit

It changes the shader assembly accordingly:

before   : (-f0.1.any4h)  halt(8)  17 2  null { align1 WE_all 1Q };
after(8) : (-f0.1.any8h)  halt(8)  17 2  null { align1 WE_all 1Q };
after(16): (-f0.1.any16h) halt(16) 17 2  null { align1 WE_all 1H };

v2:  Cleaned up comments and conditional ordering.

Signed-off-by: Cody Northrop <cody at lunarg.com>
Reviewed-by: Mike Stroyan <mike at lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79948
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 654f5fe..3516aca 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1929,15 +1929,14 @@ fs_visitor::visit(ir_discard *ir)
    if (brw->gen >= 6) {
       /* For performance, after a discard, jump to the end of the shader.
-       * However, many people will do foliage by discarding based on a
-       * texture's alpha mask, and then continue on to texture with the
-       * remaining pixels.  To avoid trashing the derivatives for those
-       * texture samples, we'll only jump if all of the pixels in the subspan
-       * have been discarded.
+       * Only jump if all relavant channels have been discarded.
       fs_inst *discard_jump = emit(FS_OPCODE_DISCARD_JUMP);
       discard_jump->flag_subreg = 1;
-      discard_jump->predicate = BRW_PREDICATE_ALIGN1_ANY4H;
+      discard_jump->predicate = (dispatch_width == 8)
+                                ? BRW_PREDICATE_ALIGN1_ANY8H
+                                : BRW_PREDICATE_ALIGN1_ANY16H;
       discard_jump->predicate_inverse = true;

More information about the mesa-dev mailing list