[Mesa-dev] [PATCH 11/12] i965/vec4: Eliminate writes that are never read.

Matt Turner mattst88 at gmail.com
Mon Mar 24 11:06:51 PDT 2014


On Thu, Mar 20, 2014 at 12:28 PM, Eric Anholt <eric at anholt.net> wrote:
> Matt Turner <mattst88 at gmail.com> writes:
>
>> With an awful O(n^2) algorithm that searches previous instructions for
>> dead writes.
>>
>> total instructions in shared programs: 805582 -> 788074 (-2.17%)
>> instructions in affected programs:     144561 -> 127053 (-12.11%)
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 46 ++++++++++++++++++++++++++++++++++
>>  1 file changed, 46 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 4ad398a..e9219a9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -369,6 +369,7 @@ bool
>>  vec4_visitor::dead_code_eliminate()
>>  {
>>     bool progress = false;
>> +   bool seen_control_flow = false;
>>     int pc = -1;
>>
>>     calculate_live_intervals();
>> @@ -378,6 +379,8 @@ vec4_visitor::dead_code_eliminate()
>>
>>        pc++;
>>
>> +      seen_control_flow = inst->is_control_flow() || seen_control_flow;
>> +
>
> So, once there's control flow in the program, this piece of optimization
> doesn't happen ever after it?  Seems like in the walk backwards you
> could just stop the walk when you find a control flow instruction.

That's a good idea. I'll try to implement that today.
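
For illustration only, here is a minimal standalone sketch of the backward walk stopping at control flow, rather than disabling the pass for the rest of the program. It is not the actual patch: `Inst` and `trim_dead_writes` are hypothetical names, the instruction is reduced to one destination and at most one source, and predication, register offsets and side effects are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a vec4 instruction.
struct Inst {
    int dst_reg;         // register written, or -1 if none
    unsigned writemask;  // channels written, bits 0..3 = x,y,z,w
    int src_reg;         // register read, or -1 if none
    unsigned readmask;   // channels of src_reg that are read
    bool control_flow;   // IF/ELSE/ENDIF/DO/WHILE and friends
};

// For every write, walk backwards and drop channels of earlier writes to
// the same register that are overwritten before being read. The walk
// stops at the first control flow instruction it meets.
// Returns the number of earlier instructions whose writemask was trimmed.
int trim_dead_writes(std::vector<Inst> &insts)
{
    int trimmed = 0;

    for (std::size_t i = 0; i < insts.size(); i++) {
        const Inst inst = insts[i];
        assert(inst.writemask <= 0xFu);

        if (inst.control_flow || inst.dst_reg < 0)
            continue;

        // Channels this instruction overwrites. Channels it reads from
        // its own destination are still live, so they are not dead.
        unsigned dead_channels = inst.writemask;
        if (inst.src_reg == inst.dst_reg)
            dead_channels &= ~inst.readmask;

        for (std::size_t j = i; j > 0 && dead_channels != 0; ) {
            Inst &prev = insts[--j];

            // Stop here: past this point other paths may read the value.
            if (prev.control_flow)
                break;

            if (prev.dst_reg == inst.dst_reg &&
                (prev.writemask & dead_channels) != 0) {
                prev.writemask &= ~dead_channels;  // 0 means fully dead
                trimmed++;
            }

            // A read of a channel keeps older writes to it alive.
            // (Conservative: reads of a fully dead prev still count.)
            if (prev.src_reg == inst.dst_reg)
                dead_channels &= ~prev.readmask;
        }
    }
    return trimmed;
}
```

In this model a full write to r0 followed directly by a write to r0.xy ends up with the first writemask trimmed to .zw, while the same pair separated by a control flow instruction is left alone.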

>>        if (inst->dst.file != GRF || inst->has_side_effects())
>>           continue;
>>
>> @@ -393,6 +396,49 @@ vec4_visitor::dead_code_eliminate()
>>        }
>>
>>        progress = try_eliminate_instruction(inst, write_mask) || progress;
>> +
>> +      if (seen_control_flow || inst->predicate || inst->prev == NULL)
>> +         continue;
>> +
>> +      int dead_channels = inst->dst.writemask;
>> +
>> +      for (int i = 0; i < 3; i++) {
>> +         if (inst->src[i].file != GRF ||
>> +             inst->src[i].reg != inst->dst.reg)
>> +               continue;
>> +
>> +         for (int j = 0; j < 4; j++) {
>> +            int swiz = BRW_GET_SWZ(inst->src[i].swizzle, j);
>> +            dead_channels &= ~(1 << swiz);
>> +         }
>> +      }
>> +
>> +      for (exec_node *node = inst->prev, *prev = node->prev;
>> +           prev != NULL && dead_channels != 0;
>> +           node = prev, prev = prev->prev) {
>
> You could potentially terminate the loop when you're out of the live
> range of the dst, which would reduce the pain of n^2.

Also a good idea. I thought about this a little and punted because I'd
have to modify the ip value when previous instructions were removed,
but that should be a pretty simple change. I'll try to do that too.
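
A rough sketch of bounding the walk by the live range, again with made-up names (`Inst`, `scan_dead_writes`, `live_start`) and a one-source instruction model. Note that it sidesteps the ip adjustment instead of implementing it: removed instructions are only flagged as removed and left in place, so the ip values from the live-interval analysis stay valid. That is a different technique from fixing up ip after each removal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: one destination, at most one source.
struct Inst {
    int dst_reg;         // register written
    unsigned writemask;  // channels written, bits 0..3 = x,y,z,w
    int src_reg;         // register read, or -1 if none
    unsigned readmask;   // channels of src_reg that are read
    bool removed;        // tombstone, set instead of erasing the entry
};

// live_start[r] is the ip of the first instruction in r's live range
// (an assumed stand-in for what a live-interval analysis would provide).
// Trims dead writes before insts[ip] and returns how many earlier
// instructions the backward walk visited.
std::size_t scan_dead_writes(std::vector<Inst> &insts, std::size_t ip,
                             const std::vector<std::size_t> &live_start)
{
    assert(ip < insts.size());
    const Inst inst = insts[ip];
    assert(inst.dst_reg >= 0 &&
           static_cast<std::size_t>(inst.dst_reg) < live_start.size());

    unsigned dead_channels = inst.writemask;
    if (inst.src_reg == inst.dst_reg)
        dead_channels &= ~inst.readmask;

    // Nothing before the start of the live range can write or read dst,
    // so the walk does not need to go any further back than that.
    const std::size_t first = live_start[inst.dst_reg];
    std::size_t visited = 0;

    for (std::size_t j = ip; j > first && dead_channels != 0; ) {
        Inst &prev = insts[--j];
        visited++;

        if (prev.removed)
            continue;

        if (prev.dst_reg == inst.dst_reg) {
            prev.writemask &= ~dead_channels;
            if (prev.writemask == 0) {
                prev.removed = true;  // ips of later instructions unchanged
                continue;             // its reads are gone with it
            }
        }

        if (prev.src_reg == inst.dst_reg)
            dead_channels &= ~prev.readmask;
    }
    return visited;
}
```

The return value is only there to make the saving visible: the number of instructions visited is bounded by the length of the destination's live range, not by the position in the program.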

Thanks for the reviews and the ideas.

