[Mesa-dev] [PATCH] nv50/ir: make sure to erase src2 after optimizing to MOV
Ilia Mirkin
imirkin at alum.mit.edu
Thu Nov 10 01:32:04 UTC 2016
On Wed, Nov 9, 2016 at 10:35 AM, Samuel Pitoiset
<samuel.pitoiset at gmail.com> wrote:
>
>
> On 11/09/2016 04:19 PM, Ilia Mirkin wrote:
>>
>> On Wed, Nov 9, 2016 at 10:10 AM, Samuel Pitoiset
>> <samuel.pitoiset at gmail.com> wrote:
>>>
>>>
>>>
>>> On 11/09/2016 03:58 PM, Ilia Mirkin wrote:
>>>>
>>>>
>>>> On Wed, Nov 9, 2016 at 9:20 AM, Samuel Pitoiset
>>>> <samuel.pitoiset at gmail.com> wrote:
>>>>>
>>>>>
>>>>> For all instructions with 3 sources (like OP_SLCT), src2 needs
>>>>> to be destroyed because srcExists(2) will return true although
>>>>> it's actually undefined.
>>>>>
>>>>> Spotted with my ADD3 series.
>>>>
>>>>
>>>>
>>>> Sounds like the ADD3 series' addition to this function doesn't handle
>>>> this scenario. It's already properly handled by FMA/MAD, for example.
>>>
>>>
>>>
>>> Nope, it's correctly handled in expr() in my series.
>>>
>>> In ConstantFolding::visit(BasicBlock *bb), I added a new opnd2() method
>>> which is used for folding ADD3(a,b,c)-> ADD3(b,a+c) or ->ADD3(a,b+c)
>>> because
>>> the pass doesn't currently implement such a case.
>>>
>>> Though, opnd2() is called if srcExists(2) is true and if src0/src1 or
>>> src0/src2 are immediate values. But as this is done after the opnd3()
>>> call,
>>> if the instruction has been constant folded (just before) but src2 not
>>> erased, it will try to get the immediates and crash... which is expected.
>>>
>>> This just makes expr() more robust against that, and it's not entirely
>>> related to my series, because if you try to detect more scenarios where
>>> immediates can be folded *after* opnd3() you will hit the issue for sure.
>>>
>>> src1 is already erased in expr(), I don't see why src2 should not be as
>>> well
>>> (when optimizing to SAT/MOV).
>>
>>
>> src2 is also erased.
>>
>> switch (i->op) {
>> case OP_MAD:
>> case OP_FMA: {
>> ...
>> i->setSrc(2, NULL);
>> }
>>
>> You need to add a case for ADD3 here that does the same thing. And
>> also immediate re-executes expr/opnd in case the other arg was *also*
>> an immed. That should avoid taking up an extra fold iteration on it.
>>
>
> I know, I already added all the dance there. Let me explain again. :-)
>
> I found the crash with OP_SLCT which has 3 sources. expr() has been called,
> and that SLCT has been optimized to MOV because src0 and src1 were 0. Thus,
> srcExists(1) will return false while srcExists(2) will return true because
> it has not been erased. right?
>
> The thing is, after the opnd3() call in the visit func, I added some logic
> in order to detect if src0/src2 or src1/src2 are immediates.
Then it's a bug in the SLCT handling.
Look - setting src(2) to null like that is a bug. It might be a
predicate source (for ->setPredicate), or a couple other things. If
you're changing ops from one to another you should be using
moveSources which will adjust everything as well. Although changing
from SLCT to a MOV should be impossible -- IIRC SLCT is a
CmpInstruction while MOV should be a regular one.
-ilia
>
> Have a look at this chunk http://hastebin.com/ihalexobus.cpp .
>
>
>>>
>>>
>>>>
>>>>>
>>>>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
>>>>> ---
>>>>> src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 +
>>>>> 1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> index 9cf6ddc..ef31436 100644
>>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>> @@ -772,6 +772,7 @@ ConstantFolding::expr(Instruction *i,
>>>>> break;
>>>>> default:
>>>>> i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary()
>>>>> */
>>>>> + i->setSrc(2, NULL);
>>>>> break;
>>>>> }
>>>>> i->subOp = 0;
>>>>> --
>>>>> 2.10.2
>>>>>
>>>>> _______________________________________________
>>>>> mesa-dev mailing list
>>>>> mesa-dev at lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>>
>>>
>>> --
>>> -Samuel
>
>
> --
> -Samuel
More information about the mesa-dev
mailing list