[Nouveau] [PATCH 4/4] nv50/ir: allow OP_SET to merge with OP_SET_AND/etc as well as a neg

Ilia Mirkin imirkin at alum.mit.edu
Sat May 9 13:07:55 PDT 2015


On Sat, May 9, 2015 at 4:04 PM, Tobias Klausmann
<tobias.johannes.klausmann at mni.thm.de> wrote:
>
>
> On 09.05.2015 07:35, Ilia Mirkin wrote:
>>
>> This covers the pattern where a KILL_IF is used, which triggers a
>> comparison of -x to 0. This can usually be folded into the comparison
>> whose
>> result is being compared to 0, however it may, itself, have already been
>> combined with another comparison. That shouldn't impact the logic of
>> this pass however. With this and the & 1.0 change, code like
>>
>> 00000020: 001c0001 80081df4     set b32 $r0 lt f32 $r0 0x3e800000
>> 00000028: 001c0000 201fc000     and b32 $r0 $r0 0x3f800000
>> 00000030: 7f9c001e dd885c00     set $p0 0x1 lt f32 neg $r0 0x0
>> 00000038: 0000003c 19800000     $p0 discard
>>
>> becomes
>>
>> 00000020: 001c001d b5881df4     set $p0 0x1 lt f32 $r0 0x3e800000
>> 00000028: 0000003c 19800000     $p0 discard
>>
>> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>> ---
>>   .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 51
>> ++++++++++++++--------
>>   1 file changed, 33 insertions(+), 18 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> index d8af19a..43a2fe9 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> @@ -278,7 +278,6 @@ private:
>>        void tryCollapseChainedMULs(Instruction *, const int s,
>> ImmediateValue&);
>>   -   // TGSI 'true' is converted to -1 by F2I(NEG(SET)), track back to
>> SET
>>      CmpInstruction *findOriginForTestWithZero(Value *);
>>        unsigned int foldCount;
>> @@ -337,16 +336,10 @@ ConstantFolding::findOriginForTestWithZero(Value
>> *value)
>>         return NULL;
>>      Instruction *insn = value->getInsn();
>>   -   while (insn && insn->op != OP_SET) {
>> +   while (insn && insn->op != OP_SET && insn->op != OP_SET_AND &&
>> +          insn->op != OP_SET_OR && insn->op != OP_SET_XOR) {
>>         Instruction *next = NULL;
>>         switch (insn->op) {
>> -      case OP_NEG:
>> -      case OP_ABS:
>> -      case OP_CVT:
>> -         next = insn->getSrc(0)->getInsn();
>> -         if (insn->sType != next->dType)
>> -            return NULL;
>> -         break;
>>         case OP_MOV:
>>            next = insn->getSrc(0)->getInsn();
>>            break;
>> @@ -946,29 +939,51 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue
>> &imm0, int s)
>>        case OP_SET: // TODO: SET_AND,OR,XOR
>
>
> delete this comment?!

No... this still only handles OP_SET.

>
>
>>      {
>> +      /* This optimizes the case where the output of a set is being
>> compared
>> +       * to zero. Since the set can only produce 0/-1 (int) or 0/1
>> (float), we
>> +       * can be a lot cleverer in our comparison.
>> +       */
>>         CmpInstruction *si = findOriginForTestWithZero(i->getSrc(t));
>>         CondCode cc, ccZ;
>> -      if (i->src(t).mod != Modifier(0))
>> -         return;
>> -      if (imm0.reg.data.u32 != 0 || !si || si->op != OP_SET)
>> +      if (imm0.reg.data.u32 != 0 || !si)
>>            return;
>>         cc = si->setCond;
>>         ccZ = (CondCode)((unsigned int)i->asCmp()->setCond & ~CC_U);
>> +      // We do everything assuming var (cmp) 0, reverse the condition if
>> 0 is
>> +      // first.
>>         if (s == 0)
>>            ccZ = reverseCondCode(ccZ);
>> +      // If there is a negative modifier, we need to undo that, by
>> flipping
>> +      // the comparison to zero.
>> +      if (i->src(t).mod.neg())
>> +         ccZ = reverseCondCode(ccZ);
>> +      // If this is a signed comparison, we expect the input to be a
>> regular
>> +      // boolean, i.e. 0/-1. However the rest of the logic assumes that
>> true
>> +      // is positive, so just flip the sign.
>> +      if (i->sType == TYPE_S32) {
>> +         assert(!isFloatType(si->dType));
>> +         ccZ = reverseCondCode(ccZ);
>> +      }
>
>
> can both this and the previous condition evaluate to true? if yes, this
> double-flips ccZ...

Triple, in fact :) Each thing causes a direction flip... I guess I
should just sum them up, and flip it if it's odd, but... wtvr. This
seems more straightforward.

Actually both this and the previous commit required a bit more work,
I'll send v2's, but the latest versions are at:

https://github.com/imirkin/mesa/commits/tmp4

>
>
>>         switch (ccZ) {
>> -      case CC_LT: cc = CC_FL; break;
>> -      case CC_GE: cc = CC_TR; break;
>> -      case CC_EQ: cc = inverseCondCode(cc); break;
>> -      case CC_LE: cc = inverseCondCode(cc); break;
>> -      case CC_GT: break;
>> -      case CC_NE: break;
>> +      case CC_LT: cc = CC_FL; break; // bool < 0 -- this is never true
>> +      case CC_GE: cc = CC_TR; break; // bool >= 0 -- this is always true
>> +      case CC_EQ: cc = inverseCondCode(cc); break; // bool == 0 -- !bool
>> +      case CC_LE: cc = inverseCondCode(cc); break; // bool <= 0 -- !bool
>> +      case CC_GT: break; // bool > 0 -- bool
>> +      case CC_NE: break; // bool != 0 -- bool
>>         default:
>>            return;
>>         }
>> +
>> +      // Update the condition of this SET to be identical to the origin
>> set,
>> +      // but with the updated condition code. The original SET should get
>> +      // DCE'd, ideally.
>> +      i->op = si->op;
>>         i->asCmp()->setCond = cc;
>>         i->setSrc(0, si->src(0));
>>         i->setSrc(1, si->src(1));
>> +      if (si->srcExists(2))
>> +         i->setSrc(2, si->src(2));
>>         i->sType = si->sType;
>>      }
>>         break;
>
>


More information about the Nouveau mailing list