[Mesa-dev] Possible bug in nir_algebraic?

Ian Romanick idr at freedesktop.org
Mon Jun 24 18:33:59 UTC 2019

On 6/22/19 5:10 AM, Connor Abbott wrote:
> I haven't thought about whether it's algebraically correct, but
> otherwise your pattern looks fine to me.

I'll have a proof when I send out the patch. :)

> If you haven't noticed already, I added some commented out code to
> nir_replace_instr() that will print out each pattern that's matched.
> The first thing I'd do is move that up to the beginning of the
> function, so that it prints potential matches instead of actual
> matches. If your pattern shows up as a potential match, then it's a
> problem with match_expression() and not the automaton, and at that
> point you can start setting breakpoints and stepping through it.

It looks like the automaton is matching, but match_expression is not.
I'll dig deeper.  Thanks for the tips.

> If it's not a potential match because the automaton filtered it out,
> then debugging is currently a little harder. You'll have to add some
> debugging code to ${pass_name}_pre_block() that prints out the
> assigned state for each ALU instruction. The state is an integer which
> is conceptually an index into TreeAutomaton.states for the constructed
> TreeAutomaton. So if an instruction has state n, then
> automaton.states[n] should be a set of potential partial matches for
> that instruction. Note that variables like a, b, c etc. are converted
> into __wildcard while constants are all converted into __constant. So
> for example, ssa_2 should have a set { __const, __wildcard } as it can
> be matched as a variable or as a constant (we actually explicitly
> construct this set in TreeAutomaton._build_table). ssa_83 should have
> a set { __wildcard, (neg __wildcard) } since it can be matched either
> as a variable or as something like (neg a). (yes, this is very similar
> to the subset construction for getting a DFA from an NFA...). Unless
> there's a bug, each of these "match sets" should contain the
> appropriate subset of your pattern until ssa_97 which should have the
> full pattern as one of the entries in the set. Let us know the details
> if that's not the case.
> I think that we could definitely do better when it comes to debugging
> why the automaton didn't match something. We could emit the
> automaton's state list in C, and then have a debugging option to print
> the match set for each instruction so you'd know where something went
> awry. I didn't do that earlier since I didn't have a need for it while
> bringing up the automaton, but we could add it if it helps. That being
> said, hopefully you won't need it this time :)
> Best,
> Connor
> On Sat, Jun 22, 2019 at 2:26 AM Ian Romanick <idr at freedesktop.org> wrote:
>> I have encountered what I believe to be a bug in nir_algebraic.  Since
>> the rewrite to use automata, I'm not sure how to begin debugging it.
>> I'm looking for some suggestions... even if the suggestion is, "Fix your
>> patterns."
>> I have added a pattern like:
>>    (('~fadd at 32', ('fmul', ('fadd', 1.0, ('fneg', a)),
>>                           ('fadd', 1.0, ('fneg', a))),
>>                  ('fmul', ('flrp', a, 1.0, a), b)),
>>     ('flrp', 1.0, b, a), '!options->lower_flrp32'),
>> While using NIR_PRINT=1, I see this in my instruction stream:
>>         vec1 32 ssa_2 = load_const (0x3f800000 /* 1.000000 */)
>>         ...
>>         vec1 32 ssa_196 = intrinsic load_uniform (ssa_195) (68, 4, 160)
>>         vec1 32 ssa_83 = fneg ssa_196
>>         vec1 32 ssa_84 = fadd ssa_83, ssa_2
>>         vec1 32 ssa_85 = fmul ssa_84, ssa_84
>>         ...
>>         vec1 32 ssa_95 = flrp ssa_196, ssa_2, ssa_196
>>         vec1 32 ssa_96 = fmul ssa_78, ssa_95
>>         vec1 32 ssa_97 = fadd ssa_96, ssa_85
>> But nir_opt_algebraic does not make any progress.  It sure looks like it
>> should trigger with a = ssa_196 and b = ssa_78.
>> However, progress is made if I change the pattern to
>>    (('~fadd at 32', ('fmul', ('fadd', 1.0, ('fneg', a)),
>>                           c),
>>                  ('fmul', ('flrp', a, 1.0, a), b)),
>>     ('flrp', 1.0, b, a), '!options->lower_flrp32'),
>> ssa_85 is definitely ('fmul', ssa_84, ssa_84), and ssa_84 is definitely
>> ('fadd', 1.0, ('fneg', ssa_196))... both times. :)

More information about the mesa-dev mailing list