[Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

Connor Abbott cwabbott0 at gmail.com
Wed Feb 18 10:09:03 PST 2015


On Wed, Feb 18, 2015 at 8:31 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Connor Abbott <cwabbott0 at gmail.com> writes:
>
>> On Tue, Feb 17, 2015 at 3:04 PM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>>
>>>> On Mon, Feb 16, 2015 at 11:39 AM, Francisco Jerez <currojerez at riseup.net>
>>>> wrote:
>>>>
>>>>> The round-robin allocation strategy is expected to decrease the amount
>>>>> of false dependencies created by the register allocator and give the
>>>>> post-RA scheduling pass more freedom to move instructions around.  On
>>>>> the other hand it has the disadvantage of increasing fragmentation and
>>>>> decreasing the number of equally-colored nearby nodes, what increases
>>>>> the likelihood of failure in presence of optimistically colorable
>>>>> nodes.
>>>>>
>>>>> This patch disables the round-robin strategy for optimistically
>>>>> colorable nodes.  These typically arise in situations of high register
>>>>> pressure or for registers with large live intervals, in both cases the
>>>>> task of the instruction scheduler shouldn't be constrained excessively
>>>>> by the dense packing of those nodes, and a spill (or on Intel hardware
>>>>> a fall-back to SIMD8 mode) is invariably worse than a slightly less
>>>>> optimal scheduling.
>>>>>
>>>>> Shader-db results on the i965 driver:
>>>>>
>>>>> total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
>>>>> instructions in affected programs:     1121 -> 1071 (-4.46%)
>>>>> helped:                                1
>>>>> HURT:                                  0
>>>>> GAINED:                                49
>>>>> LOST:                                  5
>>>>> ---
>>>>>  src/util/register_allocate.c | 22 +++++++++++++++++++++-
>>>>>  1 file changed, 21 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
>>>>> index af7a20c..d63d8eb 100644
>>>>> --- a/src/util/register_allocate.c
>>>>> +++ b/src/util/register_allocate.c
>>>>> @@ -168,6 +168,12 @@ struct ra_graph {
>>>>>
>>>>>     unsigned int *stack;
>>>>>     unsigned int stack_count;
>>>>> +
>>>>> +   /**
>>>>> +    * Tracks the start of the set of optimistically-colored registers in
>>>>> the
>>>>> +    * stack.
>>>>> +    */
>>>>> +   unsigned int stack_optimistic_start;
>>>>>  };
>>>>>
>>>>>  /**
>>>>> @@ -454,6 +460,7 @@ static void
>>>>>  ra_simplify(struct ra_graph *g)
>>>>>  {
>>>>>     bool progress = true;
>>>>> +   unsigned int stack_optimistic_start = ~0;
>>>>>
>>>>
>>>> UINT_MAX would be clearer than ~0
>>>>
>>> Sure, either works as identity for MIN2.  Do you want me to send a v3
>>> for this or should I fix it locally and add your R-b?
>>>
>>>>
>>>>>     int i;
>>>>>
>>>>>     while (progress) {
>>>>> @@ -483,12 +490,16 @@ ra_simplify(struct ra_graph *g)
>>>>>
>>>>>        if (!progress && best_optimistic_node != ~0U) {
>>>>>
>>>>
>>>> I guess we're using ~0 other places... oh well...
>>>>
>>>>
>>>>>          decrement_q(g, best_optimistic_node);
>>>>> +         stack_optimistic_start =
>>>>> +            MIN2(stack_optimistic_start, g->stack_count);
>>>>>
>>>>
>>>> It might be clearer to explicitly use an if here instead of the MIN2
>>>> because what this really means is "if (stack_optimistic_start == ~0)
>>>> stack_optimistic_start = g->stack_count;"
>>>>
>>> What I really meant to calculate here is the minimum of the indices of
>>> all optimistic nodes inserted into the stack.  What you say works too
>>> because we happen to be growing the stack monotonically.  How can using
>>> MIN2 possibly obscure the meaning of taking the minimum?
>>
>> Because you're finding stack_optimistic_*start*, i.e. the *first*
>> thing on the stack that's optimistic. It's a pretty common idiom in C,
>> when you're going through a bunch of stuff and you want to record the
>> first thing that satisfies some property, to do:
>>
>> result = some_default_value (NULL, UINT_MAX, etc.)
>> for_each_thing {
>>     ...
>>     if (has_the_property(thing) && result == some_default_value) {
>>         result = thing;
>>     }
>> }
>>
>> so if you do what Jason suggested, people will see the pattern and go
>> "ah, it's recording the first thing we pushed optimistically, that
>> makes sense" as opposed to thinking about what the minimum is doing;
>> it's not obvious at first that after the first MIN2
>> stack_optimistic_start is never going to change.
>>
> Yeah, I completely agree with your argumentation, actually it's funny
> that this was precisely the reason that led me to intentionally write
> MIN2() rather than your open-coded equivalent.  I consider the way how
> something is going to change or not along your incidental control flow a
> distraction from its formal definition.  But I'm not going spend a
> single second more of my time in this discussion because it is a matter
> of personal taste and there's no objectively better or worse choice,
> there's no point in being categorical here.  If you insist in stressing
> the how over the what and cannot tolerate somebody else's personal
> inclination I'm just going to do as you say because this has long
> crossed the line of bike-shedding.
>
> Did you actually intend to review this patch?  And if so do you want a
> resend with the trivial codestyle changes?

Sorry if I came off as too argumentative there -- it's really not a
big deal, and I've spent too much time arguing about it. Feel free to
change the codestyle or not, but you can put my r-b on it.

>
>>>
>>>> Other than that (and connor's comment), it looks fine to me.
>>>>
>>>> --Jason
>>>>
>>>>
>>>>>          g->stack[g->stack_count] = best_optimistic_node;
>>>>>          g->stack_count++;
>>>>>          g->nodes[best_optimistic_node].in_stack = true;
>>>>>          progress = true;
>>>>>        }
>>>>>     }
>>>>> +
>>>>> +   g->stack_optimistic_start = stack_optimistic_start;
>>>>>  }
>>>>>
>>>>>  /**
>>>>> @@ -542,7 +553,16 @@ ra_select(struct ra_graph *g)
>>>>>        g->nodes[n].reg = r;
>>>>>        g->stack_count--;
>>>>>
>>>>> -      if (g->regs->round_robin)
>>>>> +      /* Rotate the starting point except for optimistically colorable
>>>>> nodes.
>>>>> +       * The likelihood that we will succeed at allocating optimistically
>>>>> +       * colorable nodes is highly dependent on the way that the previous
>>>>> +       * nodes popped off the stack are laid out.  The round-robin
>>>>> strategy
>>>>> +       * increases the fragmentation of the register file and decreases
>>>>> the
>>>>> +       * number of nearby nodes assigned to the same color, what
>>>>> increases the
>>>>> +       * likelihood of spilling with respect to the dense packing
>>>>> strategy.
>>>>> +       */
>>>>> +      if (g->regs->round_robin &&
>>>>> +          g->stack_count <= g->stack_optimistic_start)
>>>>>           start_search_reg = r + 1;
>>>>>     }
>>>>>
>>>>> --
>>>>> 2.1.3
>>>>>
>>>>> _______________________________________________
>>>>> mesa-dev mailing list
>>>>> mesa-dev at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>>
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>


More information about the mesa-dev mailing list