[Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

Francisco Jerez currojerez at riseup.net
Mon Feb 16 09:34:00 PST 2015


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Feb 16, 2015 8:35 AM, "Francisco Jerez" <currojerez at riseup.net> wrote:
>>
>> The round-robin allocation strategy is expected to decrease the number
>> of false dependencies created by the register allocator and give the
>> post-RA scheduling pass more freedom to move instructions around.  On
>> the other hand it has the disadvantage of increasing fragmentation and
>> decreasing the number of equally-colored nearby nodes, which increases
>> the likelihood of failure in the presence of optimistically colorable
>> nodes.
>>
>> This patch disables the round-robin strategy for optimistically
>> colorable nodes.  These typically arise in situations of high register
>> pressure or for registers with large live intervals.  In both cases the
>> task of the instruction scheduler shouldn't be constrained excessively
>> by the dense packing of those nodes, and a spill (or on Intel hardware
>> a fall-back to SIMD8 mode) is invariably worse than a slightly less
>> optimal schedule.
>
> Actually, that's not true.  Matt was doing some experiments recently with a
> noise shader from synmark and the difference between our 2nd and 3rd choice
> schedulers is huge.  In that test he disabled the third choice scheduler
> and the result was a shader that spilled 6 or 8 times but ran something
> like 30% faster.  We really need to do some more experimentation with
> scheduling and figure out better heuristics than "SIMD16 is always faster"
> and "spilling is bad".
>

Yes, I'm aware of rare corner cases like that, e.g. where SIMD16 causes
more cache thrashing than SIMD8 and so decreases overall performance, or
where the SIMD16 version of a shader *with* spills performs better than
the SIMD8 version of the same shader without spills.

In any case it's not the register allocator's business to implement such
heuristics, and that's not an argument against the register allocator
trying to make more efficient use of the register file.

>> Shader-db results on the i965 driver:
>>
>> total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
>> instructions in affected programs:     1121 -> 1071 (-4.46%)
>> helped:                                1
>> HURT:                                  0
>> GAINED:                                49
>> LOST:                                  5
>> ---
>>  src/util/register_allocate.c | 22 +++++++++++++++++++++-
>>  1 file changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
>> index af7a20c..d63d8eb 100644
>> --- a/src/util/register_allocate.c
>> +++ b/src/util/register_allocate.c
>> @@ -168,6 +168,12 @@ struct ra_graph {
>>
>>     unsigned int *stack;
>>     unsigned int stack_count;
>> +
>> +   /**
>> +    * Tracks the start of the set of optimistically-colored registers in the
>> +    * stack.
>> +    */
>> +   unsigned int stack_optimistic_start;
>>  };
>>
>>  /**
>> @@ -454,6 +460,7 @@ static void
>>  ra_simplify(struct ra_graph *g)
>>  {
>>     bool progress = true;
>> +   unsigned int stack_optimistic_start = ~0;
>>     int i;
>>
>>     while (progress) {
>> @@ -483,12 +490,16 @@ ra_simplify(struct ra_graph *g)
>>
>>        if (!progress && best_optimistic_node != ~0U) {
>>          decrement_q(g, best_optimistic_node);
>> +         stack_optimistic_start =
>> +            MIN2(stack_optimistic_start, g->stack_count);
>>          g->stack[g->stack_count] = best_optimistic_node;
>>          g->stack_count++;
>>          g->nodes[best_optimistic_node].in_stack = true;
>>          progress = true;
>>        }
>>     }
>> +
>> +   g->stack_optimistic_start = stack_optimistic_start;
>>  }
>>
>>  /**
>> @@ -542,7 +553,16 @@ ra_select(struct ra_graph *g)
>>        g->nodes[n].reg = r;
>>        g->stack_count--;
>>
>> -      if (g->regs->round_robin)
>> +      /* Rotate the starting point except for optimistically colorable nodes.
>> +       * The likelihood that we will succeed at allocating optimistically
>> +       * colorable nodes is highly dependent on the way that the previous
>> +       * nodes popped off the stack are laid out.  The round-robin strategy
>> +       * increases the fragmentation of the register file and decreases the
>> +       * number of nearby nodes assigned to the same color, what increases the
>> +       * likelihood of spilling with respect to the dense packing strategy.
>> +       */
>> +      if (g->regs->round_robin &&
>> +          g->stack_count <= g->stack_optimistic_start)
>>           start_search_reg = r + 1;
>>     }
>>
>> --
>> 2.1.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev