[Mesa-dev] [PATCH] st/mesa: remove copy-propagation pass

Brian Paul brianp at vmware.com
Wed Mar 25 12:21:18 PDT 2015


The problem is, our binary shader interface only supports 32 temps at 
this time.  We sometimes bump into that limit as-is.

-Brian

On 03/25/2015 01:04 PM, Ilia Mirkin wrote:
> Yes, more temp registers and more instructions. But presumably the
> backend has an optimization pass that is at least as good as this one
> (hopefully better!). Is that not the case for vmware?
>
> On Wed, Mar 25, 2015 at 2:59 PM, Brian Paul <brianp at vmware.com> wrote:
>> Will removing this pass have much effect on the number of temp regs used?
>> It looks like more instructions may be emitted w/out this pass.
>>
>> We're kind of sensitive to that in the VMware driver.
>>
>> -Brian
>>
>> On 03/25/2015 12:16 PM, Marek Olšák wrote:
>>>
>>> Reviewed-by: Marek Olšák <marek.olsak at amd.com>
>>>
>>> I might need to wait for other people's opinion too.
>>>
>>> Marek
>>>
>>> On Wed, Mar 25, 2015 at 6:34 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>>
>>>> It's buggy and unnecessary in the presence of optimizing backends. The
>>>> only backend that will suffer is nv30, but... meh.
>>>>
>>>> Bugzilla:
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D89759&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=hW65RavQ_Xuvw96f61daCkas_SjeEudtADNX3BzgNQU&s=zjWC0LOuYp8NH6K072ITDgPYCCE0F_a_LCdd9zrdrhA&e=
>>>>
>>>> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>>>> ---
>>>>    src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 199
>>>> -----------------------------
>>>>    1 file changed, 199 deletions(-)
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> index efee4b2..0402ce3 100644
>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> @@ -461,7 +461,6 @@ public:
>>>>       int get_last_temp_read(int index);
>>>>       int get_last_temp_write(int index);
>>>>
>>>> -   void copy_propagate(void);
>>>>       int eliminate_dead_code(void);
>>>>
>>>>       void merge_two_dsts(void);
>>>> @@ -3757,203 +3756,6 @@ glsl_to_tgsi_visitor::get_last_temp_write(int
>>>> index)
>>>>    }
>>>>
>>>>    /*
>>>> - * On a basic block basis, tracks available PROGRAM_TEMPORARY register
>>>> - * channels for copy propagation and updates following instructions to
>>>> - * use the original versions.
>>>> - *
>>>> - * The glsl_to_tgsi_visitor lazily produces code assuming that this pass
>>>> - * will occur.  As an example, a TXP production before this pass:
>>>> - *
>>>> - * 0: MOV TEMP[1], INPUT[4].xyyy;
>>>> - * 1: MOV TEMP[1].w, INPUT[4].wwww;
>>>> - * 2: TXP TEMP[2], TEMP[1], texture[0], 2D;
>>>> - *
>>>> - * and after:
>>>> - *
>>>> - * 0: MOV TEMP[1], INPUT[4].xyyy;
>>>> - * 1: MOV TEMP[1].w, INPUT[4].wwww;
>>>> - * 2: TXP TEMP[2], INPUT[4].xyyw, texture[0], 2D;
>>>> - *
>>>> - * which allows for dead code elimination on TEMP[1]'s writes.
>>>> - */
>>>> -void
>>>> -glsl_to_tgsi_visitor::copy_propagate(void)
>>>> -{
>>>> -   glsl_to_tgsi_instruction **acp = rzalloc_array(mem_ctx,
>>>> -
>>>> glsl_to_tgsi_instruction *,
>>>> -                                                  this->next_temp * 4);
>>>> -   int *acp_level = rzalloc_array(mem_ctx, int, this->next_temp * 4);
>>>> -   int level = 0;
>>>> -
>>>> -   foreach_in_list(glsl_to_tgsi_instruction, inst, &this->instructions)
>>>> {
>>>> -      assert(inst->dst[0].file != PROGRAM_TEMPORARY
>>>> -             || inst->dst[0].index < this->next_temp);
>>>> -
>>>> -      /* First, do any copy propagation possible into the src regs. */
>>>> -      for (int r = 0; r < 3; r++) {
>>>> -         glsl_to_tgsi_instruction *first = NULL;
>>>> -         bool good = true;
>>>> -         int acp_base = inst->src[r].index * 4;
>>>> -
>>>> -         if (inst->src[r].file != PROGRAM_TEMPORARY ||
>>>> -             inst->src[r].reladdr ||
>>>> -             inst->src[r].reladdr2)
>>>> -            continue;
>>>> -
>>>> -         /* See if we can find entries in the ACP consisting of MOVs
>>>> -          * from the same src register for all the swizzled channels
>>>> -          * of this src register reference.
>>>> -          */
>>>> -         for (int i = 0; i < 4; i++) {
>>>> -            int src_chan = GET_SWZ(inst->src[r].swizzle, i);
>>>> -            glsl_to_tgsi_instruction *copy_chan = acp[acp_base +
>>>> src_chan];
>>>> -
>>>> -            if (!copy_chan) {
>>>> -               good = false;
>>>> -               break;
>>>> -            }
>>>> -
>>>> -            assert(acp_level[acp_base + src_chan] <= level);
>>>> -
>>>> -            if (!first) {
>>>> -               first = copy_chan;
>>>> -            } else {
>>>> -               if (first->src[0].file != copy_chan->src[0].file ||
>>>> -                   first->src[0].index != copy_chan->src[0].index ||
>>>> -                   first->src[0].index2D != copy_chan->src[0].index2D) {
>>>> -                  good = false;
>>>> -                  break;
>>>> -               }
>>>> -            }
>>>> -         }
>>>> -
>>>> -         if (good) {
>>>> -            /* We've now validated that we can copy-propagate to
>>>> -             * replace this src register reference.  Do it.
>>>> -             */
>>>> -            inst->src[r].file = first->src[0].file;
>>>> -            inst->src[r].index = first->src[0].index;
>>>> -            inst->src[r].index2D = first->src[0].index2D;
>>>> -            inst->src[r].has_index2 = first->src[0].has_index2;
>>>> -
>>>> -            int swizzle = 0;
>>>> -            for (int i = 0; i < 4; i++) {
>>>> -               int src_chan = GET_SWZ(inst->src[r].swizzle, i);
>>>> -               glsl_to_tgsi_instruction *copy_inst = acp[acp_base +
>>>> src_chan];
>>>> -               swizzle |= (GET_SWZ(copy_inst->src[0].swizzle, src_chan)
>>>> << (3 * i));
>>>> -            }
>>>> -            inst->src[r].swizzle = swizzle;
>>>> -         }
>>>> -      }
>>>> -
>>>> -      switch (inst->op) {
>>>> -      case TGSI_OPCODE_BGNLOOP:
>>>> -      case TGSI_OPCODE_ENDLOOP:
>>>> -         /* End of a basic block, clear the ACP entirely. */
>>>> -         memset(acp, 0, sizeof(*acp) * this->next_temp * 4);
>>>> -         break;
>>>> -
>>>> -      case TGSI_OPCODE_IF:
>>>> -      case TGSI_OPCODE_UIF:
>>>> -         ++level;
>>>> -         break;
>>>> -
>>>> -      case TGSI_OPCODE_ENDIF:
>>>> -      case TGSI_OPCODE_ELSE:
>>>> -         /* Clear all channels written inside the block from the ACP,
>>>> but
>>>> -          * leaving those that were not touched.
>>>> -          */
>>>> -         for (int r = 0; r < this->next_temp; r++) {
>>>> -            for (int c = 0; c < 4; c++) {
>>>> -               if (!acp[4 * r + c])
>>>> -                  continue;
>>>> -
>>>> -               if (acp_level[4 * r + c] >= level)
>>>> -                  acp[4 * r + c] = NULL;
>>>> -            }
>>>> -         }
>>>> -         if (inst->op == TGSI_OPCODE_ENDIF)
>>>> -            --level;
>>>> -         break;
>>>> -
>>>> -      default:
>>>> -         /* Continuing the block, clear any written channels from
>>>> -          * the ACP.
>>>> -          */
>>>> -         for (int d = 0; d < 2; d++) {
>>>> -            if (inst->dst[d].file == PROGRAM_TEMPORARY &&
>>>> inst->dst[d].reladdr) {
>>>> -               /* Any temporary might be written, so no copy propagation
>>>> -                * across this instruction.
>>>> -                */
>>>> -               memset(acp, 0, sizeof(*acp) * this->next_temp * 4);
>>>> -            } else if (inst->dst[d].file == PROGRAM_OUTPUT &&
>>>> -                       inst->dst[d].reladdr) {
>>>> -               /* Any output might be written, so no copy propagation
>>>> -                * from outputs across this instruction.
>>>> -                */
>>>> -               for (int r = 0; r < this->next_temp; r++) {
>>>> -                  for (int c = 0; c < 4; c++) {
>>>> -                     if (!acp[4 * r + c])
>>>> -                        continue;
>>>> -
>>>> -                     if (acp[4 * r + c]->src[0].file == PROGRAM_OUTPUT)
>>>> -                        acp[4 * r + c] = NULL;
>>>> -                  }
>>>> -               }
>>>> -            } else if (inst->dst[d].file == PROGRAM_TEMPORARY ||
>>>> -                       inst->dst[d].file == PROGRAM_OUTPUT) {
>>>> -               /* Clear where it's used as dst. */
>>>> -               if (inst->dst[d].file == PROGRAM_TEMPORARY) {
>>>> -                  for (int c = 0; c < 4; c++) {
>>>> -                     if (inst->dst[d].writemask & (1 << c))
>>>> -                        acp[4 * inst->dst[d].index + c] = NULL;
>>>> -                  }
>>>> -               }
>>>> -
>>>> -               /* Clear where it's used as src. */
>>>> -               for (int r = 0; r < this->next_temp; r++) {
>>>> -                  for (int c = 0; c < 4; c++) {
>>>> -                     if (!acp[4 * r + c])
>>>> -                        continue;
>>>> -
>>>> -                     int src_chan = GET_SWZ(acp[4 * r +
>>>> c]->src[0].swizzle, c);
>>>> -
>>>> -                     if (acp[4 * r + c]->src[0].file ==
>>>> inst->dst[d].file &&
>>>> -                         acp[4 * r + c]->src[0].index ==
>>>> inst->dst[d].index &&
>>>> -                         inst->dst[d].writemask & (1 << src_chan)) {
>>>> -                        acp[4 * r + c] = NULL;
>>>> -                     }
>>>> -                  }
>>>> -               }
>>>> -            }
>>>> -         }
>>>> -         break;
>>>> -      }
>>>> -
>>>> -      /* If this is a copy, add it to the ACP. */
>>>> -      if (inst->op == TGSI_OPCODE_MOV &&
>>>> -          inst->dst[0].file == PROGRAM_TEMPORARY &&
>>>> -          !(inst->dst[0].file == inst->src[0].file &&
>>>> -             inst->dst[0].index == inst->src[0].index) &&
>>>> -          !inst->dst[0].reladdr &&
>>>> -          !inst->saturate &&
>>>> -          !inst->src[0].reladdr &&
>>>> -          !inst->src[0].reladdr2 &&
>>>> -          !inst->src[0].negate) {
>>>> -         for (int i = 0; i < 4; i++) {
>>>> -            if (inst->dst[0].writemask & (1 << i)) {
>>>> -               acp[4 * inst->dst[0].index + i] = inst;
>>>> -               acp_level[4 * inst->dst[0].index + i] = level;
>>>> -            }
>>>> -         }
>>>> -      }
>>>> -   }
>>>> -
>>>> -   ralloc_free(acp_level);
>>>> -   ralloc_free(acp);
>>>> -}
>>>> -
>>>> -/*
>>>>     * On a basic block basis, tracks available PROGRAM_TEMPORARY registers
>>>> for dead
>>>>     * code elimination.
>>>>     *
>>>> @@ -5623,7 +5425,6 @@ get_mesa_program(struct gl_context *ctx,
>>>>
>>>>       /* Perform optimizations on the instructions in the
>>>> glsl_to_tgsi_visitor. */
>>>>       v->simplify_cmp();
>>>> -   v->copy_propagate();
>>>>       while (v->eliminate_dead_code());
>>>>
>>>>       v->merge_two_dsts();
>>>> --
>>>> 2.0.5
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>>
>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=hW65RavQ_Xuvw96f61daCkas_SjeEudtADNX3BzgNQU&s=2ypQDTjgA1t1k9zgBXxsw9iSjhlz3Mta_iyE4dy07mg&e=
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>>
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=hW65RavQ_Xuvw96f61daCkas_SjeEudtADNX3BzgNQU&s=2ypQDTjgA1t1k9zgBXxsw9iSjhlz3Mta_iyE4dy07mg&e=
>>>
>>



More information about the mesa-dev mailing list