[Mesa-dev] [PATCH 4/7] i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode.

Francisco Jerez currojerez at riseup.net
Thu Apr 30 06:59:49 PDT 2015


Matt Turner <mattst88 at gmail.com> writes:

> On Fri, Feb 20, 2015 at 11:49 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp       | 49 ++++++++++++++++++++++++++++++
>>  src/mesa/drivers/dri/i965/brw_fs.h         |  1 +
>>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  1 +
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp     | 41 +++++++++++++++++++++++++
>>  src/mesa/drivers/dri/i965/brw_vec4.h       |  1 +
>>  src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  1 +
>>  6 files changed, 94 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index d567c2b..4537900 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -2734,6 +2734,54 @@ fs_visitor::compute_to_mrf()
>>  }
>>
>>  /**
>> + * Eliminate FIND_LIVE_CHANNEL instructions occurring outside any control
>> + * flow.  We could probably do better here with some form of divergence
>> + * analysis.
>> + */
>> +bool
>> +fs_visitor::eliminate_find_live_channel()
>> +{
>> +   bool progress = false;
>> +   unsigned depth = 0;
>> +
>> +   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
>> +      switch (inst->opcode) {
>> +      case BRW_OPCODE_IF:
>> +      case BRW_OPCODE_DO:
>> +         depth++;
>> +         break;
>> +
>> +      case BRW_OPCODE_ENDIF:
>> +      case BRW_OPCODE_WHILE:
>> +         depth--;
>> +         break;
>> +
>> +      case FS_OPCODE_DISCARD_JUMP:
>> +         /* This can potentially make control flow non-uniform until the end
>> +          * of the program.
>> +          */
>> +         depth++;
>> +         break;
>> +
>> +      case SHADER_OPCODE_FIND_LIVE_CHANNEL:
>> +         if (depth == 0) {
>> +            inst->opcode = BRW_OPCODE_MOV;
>> +            inst->src[0] = fs_reg(0);
>
> How does this work if we're on a primitive edge and channel zero is
> disabled? Does the EU not actually disable those channels except in
> the framebuffer write or something?
>
Yes, exactly, samples outside the primitive (or samples discarded by the
early depth/stencil test) have the corresponding execution mask bit
enabled, otherwise it wouldn't be possible to calculate derivatives near
the edge of a primitive.  Whole subspans not covered by the primitive or
dropped by early fragment tests are never dispatched to any PS thread,
so channel "0" should always have valid contents except when disabled by
control flow.

>> +            inst->sources = 1;
>> +            inst->force_writemask_all = true;
>> +            progress = true;
>> +         }
>> +         break;
>> +
>> +      default:
>> +         break;
>> +      }
>> +   }
>> +
>> +   return progress;
>> +}
>
> I wouldn't mind if this was just part of opt_algebraic. Less code that
> way and one fewer pass over the instruction list. What do you think,
> Ken?

I shortly considered doing that but finally decided not to because:
 - It's not an algebraic instruction.
 - The optimization is non-local unlike the other fully self-contained
   algebraic optimizations, as it needs to keep track of the control
   flow nesting level.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150430/78ef50d5/attachment.sig>


More information about the mesa-dev mailing list