[Mesa-dev] [PATCH] nv50/ir: make sure to erase src2 after optimizing to MOV

Ilia Mirkin imirkin at alum.mit.edu
Wed Nov 9 15:19:26 UTC 2016


On Wed, Nov 9, 2016 at 10:10 AM, Samuel Pitoiset
<samuel.pitoiset at gmail.com> wrote:
>
>
> On 11/09/2016 03:58 PM, Ilia Mirkin wrote:
>>
>> On Wed, Nov 9, 2016 at 9:20 AM, Samuel Pitoiset
>> <samuel.pitoiset at gmail.com> wrote:
>>>
>>> For all instructions with 3 sources (like OP_SLCT), src2 needs
>>> to be destroyed because srcExists(2) will return true although
>>> it's actually undefined.
>>>
>>> Spotted with my ADD3 series.
>>
>>
>> Sounds like the ADD3 series' addition to this function doesn't handle
>> this scenario. It's already properly handled by FMA/MAD, for example.
>
>
> Nope, it's correctly handled in expr() in my series.
>
> In ConstantFolding::visit(BasicBlock *bb), I added a new opnd2() method
> which is used for folding ADD3(a,b,c)-> ADD3(b,a+c) or ->ADD3(a,b+c) because
> the pass doesn't currently implement such a case.
>
> Though, opnd2() is called if srcExists(2) is true and if src0/src1 or
> src0/src2 are immediate values. But as this is done after the opnd3() call,
> if the instruction has been constant folded (just before) but src2 not
> erased, it will try to get the immediates and crash... which is expected.
>
> This just makes expr() more robust against that, and it's not entirely
> related to my series, because if you try to detect more scenarios where
> immediates can be folded *after* opnd3() you will hit the issue for sure.
>
> src1 is already erased in expr(), I don't see why src2 should not be as well
> (when optimizing to SAT/MOV).

src2 is also erased.

   switch (i->op) {
   case OP_MAD:
   case OP_FMA: {
...
      i->setSrc(2, NULL);
  }

You need to add a case for ADD3 here that does the same thing. And
also immediate re-executes expr/opnd in case the other arg was *also*
an immed. That should avoid taking up an extra fold iteration on it.

>
>
>>
>>>
>>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
>>> ---
>>>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> index 9cf6ddc..ef31436 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> @@ -772,6 +772,7 @@ ConstantFolding::expr(Instruction *i,
>>>        break;
>>>     default:
>>>        i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary()
>>> */
>>> +      i->setSrc(2, NULL);
>>>        break;
>>>     }
>>>     i->subOp = 0;
>>> --
>>> 2.10.2
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
> --
> -Samuel


More information about the mesa-dev mailing list