[Mesa-dev] [RFC PATCH 1/5] i965: Add writes_accumulator flag

Matt Turner mattst88 at gmail.com
Fri Mar 21 23:26:59 PDT 2014


On Fri, Mar 21, 2014 at 11:17 PM, Matt Turner <mattst88 at gmail.com> wrote:
> On Fri, Mar 21, 2014 at 7:33 AM, Juha-Pekka Heikkila
> <juhapekka.heikkila at gmail.com> wrote:
>> Our hardware has an "accumulator" register, which can be used to store
>> intermediate results across multiple instructions.  Many instructions
>> can implicitly write a value to the accumulator in addition to their
>> normal destination register.  This is enabled by the "AccWrEn" flag.
>>
>> This patch introduces a new flag, inst->writes_accumulator, which
>> allows us to express the AccWrEn notion in the IR.  It also creates a
>> n ALU2_ACC macro to easily define emitters for instructions that
>> implicitly write the accumulator.
>>
>> Previously, we only supported implicit accumulator writes from the
>> ADDC, SUBB, and MACH instructions.  We always enabled them on those
>> instructions, and left them disabled for other instructions.
>>
>> To take advantage of the MAC (multiply-accumulate) instruction, we
>> need to be able to set AccWrEn on other types of instructions.
>>
>> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp             | 34 +++++++++++-------------
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  7 +----
>>  src/mesa/drivers/dri/i965/brw_shader.cpp         |  3 +++
>>  src/mesa/drivers/dri/i965/brw_shader.h           |  1 +
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp           | 15 +----------
>>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  7 +----
>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 21 +++++++++++----
>>  7 files changed, 39 insertions(+), 49 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 27cf0f6..107d9e6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -64,6 +64,8 @@ fs_inst::init()
>>
>>     /* This will be the case for almost all instructions. */
>>     this->regs_written = 1;
>> +
>> +   this->writes_accumulator = false;
>>  }
>>
>>  fs_inst::fs_inst()
>> @@ -151,6 +153,15 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
>>        return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);    \
>>     }
>>
>> +#define ALU2_ACC(op)                                                    \
>> +   fs_inst *                                                            \
>> +   fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1)                 \
>> +   {                                                                    \
>> +      fs_inst *inst = new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);\
>> +      inst->writes_accumulator = true;                                  \
>> +      return inst;                                                      \
>> +   }
>> +
>>  #define ALU3(op)                                                        \
>>     fs_inst *                                                            \
>>     fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1, fs_reg src2)    \
>> @@ -166,7 +177,7 @@ ALU1(RNDE)
>>  ALU1(RNDZ)
>>  ALU2(ADD)
>>  ALU2(MUL)
>> -ALU2(MACH)
>> +ALU2_ACC(MACH)
>>  ALU2(AND)
>>  ALU2(OR)
>>  ALU2(XOR)
>> @@ -182,8 +193,8 @@ ALU1(FBH)
>>  ALU1(FBL)
>>  ALU1(CBIT)
>>  ALU3(MAD)
>> -ALU2(ADDC)
>> -ALU2(SUBB)
>> +ALU2_ACC(ADDC)
>> +ALU2_ACC(SUBB)
>>  ALU2(SEL)
>>
>>  /** Gen4 predicated IF. */
>> @@ -2103,21 +2114,8 @@ fs_visitor::dead_code_eliminate()
>>           }
>>
>>           if (dead) {
>> -            /* Don't dead code eliminate instructions that write to the
>> -             * accumulator as a side-effect. Instead just set the destination
>> -             * to the null register to free it.
>> -             */
>> -            switch (inst->opcode) {
>> -            case BRW_OPCODE_ADDC:
>> -            case BRW_OPCODE_SUBB:
>> -            case BRW_OPCODE_MACH:
>> -               inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
>> -               break;
>> -            default:
>> -               inst->remove();
>> -               progress = true;
>> -               break;
>> -            }
>> +            inst->remove();
>> +            progress = true;
>
> I think you meant something like
>
> if (inst->writes_accumulator) {
>    inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
> } else {
>    inst->remove();
> }
> progress = true;
>
> rather than removing instructions that implicitly wrote the
> accumulator when their destination was dead.

Oh, I see. You added writes_accumulator to has_side_effects(). I don't
think we want to do that. It weakens dead code elimination and
instruction scheduling. Instruction scheduling in particular, because
we treat instructions with side effects as full barriers, preventing
even independent instructions from being scheduled across the barrier.

I don't see anything in the scheduler that handles the accumulator,
but it shouldn't be very difficult to do it properly. Handling it like
we do the flag register (look for last_conditional_mod) would probably
be good.


More information about the mesa-dev mailing list