[Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

Francisco Jerez currojerez at riseup.net
Tue Feb 17 05:15:32 PST 2015

Connor Abbott <cwabbott0 at gmail.com> writes:

> Hi Francisco,
Hi Connor, and thank you for your feedback.

> A few comments:
> 1) This is just a difference in definitions, but to me an
> optimistically-colored node is a node that we pushed onto the stack
> without knowing whether we could color it or not. There may be (and
> most certainly are) nodes above the optimistically-colored node on the
> stack that we know we can color, so I wouldn't call those
> optimistically colored even though they probably have the
> characteristics you mentioned. 

Yeah, I mostly agree on the definition, but the distinction is somewhat
arbitrary and consequence of the min-q heuristic, which picks a
preferred node among the whole set of remaining non-trivially colorable
nodes, but that's by no means the only possible choice.  We may indeed
know we can trivially color some of the nodes left after we push the
first optimistically node onto the stack, but even that's dependent on
the assumption that we can take the optimistically colored node below
them out of the picture, and that's a bit of an overstatement because we
have no guarantee that the coloring we choose for these nodes won't lead
to a contradiction later on.

> The distinction is useful since the optimistically colored nodes are
> those which we may have to spill.
Now you're using "optimistically colored nodes" to refer to the nodes we
potentially consider for spilling (at most the first optimistically
colorable node and everything above it), which is precisely the same set
of nodes I'm disabling round-robin for. ;)

> 2) Given the above, what you're doing here is disabling the
> round-robin strategy for the nodes above the lowest
> optimistically-colored node as well as the lowest
> optimistically-colored node itself. But you don't need to disable
> round-robin for the lowest optimistically-colored node itself, since
> once we're able to find a register (any register!) for that one, we
> know we'll be able to find a register for everything else that's left,
> so it doesn't really matter which register we start searching at.
Heh, fair enough, I'll send a v2.

> 3) Usually, there are a series of registers that are pushed onto the
> stack optimistically until we can go back to the normal strategy, and
> it's one of the registers in the middle of the stack we fail to place,
> so we don't need to disable round-robin for *all* the optimistically
> colored nodes, only everything above the node that would've failed to
> allocate. Another way to think of it is that when I changed the
> handling of optimistic coloring a while ago, we gained ~150 SIMD16
> shaders. So of the >200 SIMD16 shaders that we optimistically color
> but still successfully allocate after your patch, only 50 -- less than
> a quarter -- actually need to have round-robin disabled. This is just
> an idea, but maybe we could handle this similarly to how we handle
> spilling: first, try ra_select() using round-robin all the way, and
> only when it fails do we disable round-robin for everything above the
> thing that we failed to allocate and try ra_select() again. We would
> keep doing this until disabling round-robin wouldn't make a
> difference.

We already discussed this shortly on IRC, but let me summarize here.

I did consider doing that to avoid the regressions (and actually went as
far as to implement it), but it seemed rather ugly to add another loop
of trial-and-error to the already ugly schedule-regalloc loop which
tries out three different scheduling heuristics for two different
dispatch widths until one succeeds, and then you have one more iteration
for each spill.

I admit that the additional back-tracking would be unlikely to affect
the running time of the register allocator significantly, but I suspect
that the pre-regalloc scheduler is also at fault in the regressing cases
and there's likely to be a better solution for them.  I don't have any
evidence either that the additional back-tracking will lead to any
measurable improvement in any of the other cases (actually I would be
very surprised if that's the case), so this suggestion seems a bit of a
premature optimization to me, how about we KISS for now.

> On Mon, Feb 16, 2015 at 11:39 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>> The round-robin allocation strategy is expected to decrease the amount
>> of false dependencies created by the register allocator and give the
>> post-RA scheduling pass more freedom to move instructions around.  On
>> the other hand it has the disadvantage of increasing fragmentation and
>> decreasing the number of equally-colored nearby nodes, what increases
>> the likelihood of failure in presence of optimistically colorable
>> nodes.
>> This patch disables the round-robin strategy for optimistically
>> colorable nodes.  These typically arise in situations of high register
>> pressure or for registers with large live intervals, in both cases the
>> task of the instruction scheduler shouldn't be constrained excessively
>> by the dense packing of those nodes, and a spill (or on Intel hardware
>> a fall-back to SIMD8 mode) is invariably worse than a slightly less
>> optimal scheduling.
>> Shader-db results on the i965 driver:
>> total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
>> instructions in affected programs:     1121 -> 1071 (-4.46%)
>> helped:                                1
>> HURT:                                  0
>> GAINED:                                49
>> LOST:                                  5
>> ---
>>  src/util/register_allocate.c | 22 +++++++++++++++++++++-
>>  1 file changed, 21 insertions(+), 1 deletion(-)
>> diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
>> index af7a20c..d63d8eb 100644
>> --- a/src/util/register_allocate.c
>> +++ b/src/util/register_allocate.c
>> @@ -168,6 +168,12 @@ struct ra_graph {
>>     unsigned int *stack;
>>     unsigned int stack_count;
>> +
>> +   /**
>> +    * Tracks the start of the set of optimistically-colored registers in the
>> +    * stack.
>> +    */
>> +   unsigned int stack_optimistic_start;
>>  };
>>  /**
>> @@ -454,6 +460,7 @@ static void
>>  ra_simplify(struct ra_graph *g)
>>  {
>>     bool progress = true;
>> +   unsigned int stack_optimistic_start = ~0;
>>     int i;
>>     while (progress) {
>> @@ -483,12 +490,16 @@ ra_simplify(struct ra_graph *g)
>>        if (!progress && best_optimistic_node != ~0U) {
>>          decrement_q(g, best_optimistic_node);
>> +         stack_optimistic_start =
>> +            MIN2(stack_optimistic_start, g->stack_count);
>>          g->stack[g->stack_count] = best_optimistic_node;
>>          g->stack_count++;
>>          g->nodes[best_optimistic_node].in_stack = true;
>>          progress = true;
>>        }
>>     }
>> +
>> +   g->stack_optimistic_start = stack_optimistic_start;
>>  }
>>  /**
>> @@ -542,7 +553,16 @@ ra_select(struct ra_graph *g)
>>        g->nodes[n].reg = r;
>>        g->stack_count--;
>> -      if (g->regs->round_robin)
>> +      /* Rotate the starting point except for optimistically colorable nodes.
>> +       * The likelihood that we will succeed at allocating optimistically
>> +       * colorable nodes is highly dependent on the way that the previous
>> +       * nodes popped off the stack are laid out.  The round-robin strategy
>> +       * increases the fragmentation of the register file and decreases the
>> +       * number of nearby nodes assigned to the same color, what increases the
>> +       * likelihood of spilling with respect to the dense packing strategy.
>> +       */
>> +      if (g->regs->round_robin &&
>> +          g->stack_count <= g->stack_optimistic_start)
>>           start_search_reg = r + 1;
>>     }
>> --
>> 2.1.3
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150217/c84c6775/attachment-0001.sig>

More information about the mesa-dev mailing list