[Mesa-dev] [PATCH] nv50/ir: make sure to erase src2 after optimizing to MOV

Samuel Pitoiset samuel.pitoiset at gmail.com
Wed Nov 9 15:35:58 UTC 2016



On 11/09/2016 04:19 PM, Ilia Mirkin wrote:
> On Wed, Nov 9, 2016 at 10:10 AM, Samuel Pitoiset
> <samuel.pitoiset at gmail.com> wrote:
>>
>>
>> On 11/09/2016 03:58 PM, Ilia Mirkin wrote:
>>>
>>> On Wed, Nov 9, 2016 at 9:20 AM, Samuel Pitoiset
>>> <samuel.pitoiset at gmail.com> wrote:
>>>>
>>>> For all instructions with 3 sources (like OP_SLCT), src2 needs
>>>> to be destroyed because srcExists(2) will return true although
>>>> it's actually undefined.
>>>>
>>>> Spotted with my ADD3 series.
>>>
>>>
>>> Sounds like the ADD3 series' addition to this function doesn't handle
>>> this scenario. It's already properly handled by FMA/MAD, for example.
>>
>>
>> Nope, it's correctly handled in expr() in my series.
>>
>> In ConstantFolding::visit(BasicBlock *bb), I added a new opnd2() method
>> which is used for folding ADD3(a,b,c)-> ADD3(b,a+c) or ->ADD3(a,b+c) because
>> the pass doesn't currently implement such a case.
>>
>> Though, opnd2() is called if srcExists(2) is true and if src0/src1 or
>> src0/src2 are immediate values. But as this is done after the opnd3() call,
>> if the instruction has been constant folded (just before) but src2 not
>> erased, it will try to get the immediates and crash... which is expected.
>>
>> This just makes expr() more robust against that, and it's not entirely
>> related to my series, because if you try to detect more scenarios where
>> immediates can be folded *after* opnd3() you will hit the issue for sure.
>>
>> src1 is already erased in expr(), I don't see why src2 should not be as well
>> (when optimizing to SAT/MOV).
>
> src2 is also erased.
>
>    switch (i->op) {
>    case OP_MAD:
>    case OP_FMA: {
> ...
>       i->setSrc(2, NULL);
>   }
>
> You need to add a case for ADD3 here that does the same thing. And
> also immediate re-executes expr/opnd in case the other arg was *also*
> an immed. That should avoid taking up an extra fold iteration on it.
>

I know, I already added all the dance there. Let me explain again. :-)

I found the crash with OP_SLCT which has 3 sources. expr() has been 
called, and that SLCT has been optimized to MOV because src0 and src1 
were 0. Thus, srcExists(1) will return false while srcExists(2) will 
return true because it has not been erased. right?

The thing is, after the opnd3() call in the visit func, I added some 
logic in order to detect if src0/src2 or src1/src2 are immediates.

Have a look at this chunk http://hastebin.com/ihalexobus.cpp .

>>
>>
>>>
>>>>
>>>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
>>>> ---
>>>>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> index 9cf6ddc..ef31436 100644
>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>> @@ -772,6 +772,7 @@ ConstantFolding::expr(Instruction *i,
>>>>        break;
>>>>     default:
>>>>        i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary()
>>>> */
>>>> +      i->setSrc(2, NULL);
>>>>        break;
>>>>     }
>>>>     i->subOp = 0;
>>>> --
>>>> 2.10.2
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>> --
>> -Samuel

-- 
-Samuel


More information about the mesa-dev mailing list