[Mesa-dev] [PATCHv2] ra: Disable round-robin strategy for optimistically colorable nodes.
Tom Stellard
tom at stellard.net
Tue Feb 17 10:24:40 PST 2015
On Tue, Feb 17, 2015 at 04:41:41PM +0200, Francisco Jerez wrote:
> Tom Stellard <tom at stellard.net> writes:
>
> > On Tue, Feb 17, 2015 at 03:23:05PM +0200, Francisco Jerez wrote:
> >> The round-robin allocation strategy is expected to decrease the amount
> >> of false dependencies created by the register allocator and give the
> >> post-RA scheduling pass more freedom to move instructions around. On
> >> the other hand it has the disadvantage of increasing fragmentation and
> >> decreasing the number of equally-colored nearby nodes, which increases
> >> the likelihood of failure in presence of optimistically colorable
> >> nodes.
> >>
> >> This patch disables the round-robin strategy for optimistically
> >> colorable nodes. These typically arise in situations of high register
> >> pressure or for registers with large live intervals; in both cases the
> >> task of the instruction scheduler shouldn't be constrained excessively
> >> by the dense packing of those nodes, and a spill (or on Intel hardware
> >> a fall-back to SIMD8 mode) is invariably worse than a slightly less
> >> optimal scheduling.
> >>
> >
> Hi Tom,
>
> > I'm trying to figure out how this will affect r300g, and it seems like
> > from your description that it will be an improvement, because r300g
> > doesn't have a post-ra scheduler and it also can't spill registers.
> >
> > What do you think?
> >
>
> It looks like it won't, apparently i965 is the only caller of
> ra_set_allocate_round_robin() in the tree right now, so it should be the
> only affected back-end. You could consider enabling it to reduce the
> number of false dependencies introduced by the register allocator -- after
> this patch it shouldn't lead to increased likelihood of register
> allocation failure anymore. It might however lead to increased register
> usage possibly limiting the number of threads your hardware can run in
> parallel, the answer really depends on whether that's a limiting factor
> for your hardware or not. I guess that if you don't have a post-RA
> scheduling pass the benefit you could possibly get from it is rather
> limited, it's probably safe to assume that you don't need it but it
> might be worth looking into.
>
Ok, thanks for the explanation. I probably won't have time to
investigate, but it's good knowing this patch is a no-op for
r300g so I don't need to worry about regressions.
-Tom
> > -Tom
> >
> >
> >> Shader-db results on the i965 driver:
> >>
> >> total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
> >> instructions in affected programs: 1121 -> 1071 (-4.46%)
> >> helped: 1
> >> HURT: 0
> >> GAINED: 49
> >> LOST: 5
> >>
> >> v2: Re-enable round-robin already for the lowest one of the nodes
> >> pushed optimistically onto the stack (Connor).
> >> ---
> >> src/util/register_allocate.c | 23 ++++++++++++++++++++++-
> >> 1 file changed, 22 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
> >> index af7a20c..b1ed273 100644
> >> --- a/src/util/register_allocate.c
> >> +++ b/src/util/register_allocate.c
> >> @@ -168,6 +168,12 @@ struct ra_graph {
> >>
> >>     unsigned int *stack;
> >>     unsigned int stack_count;
> >> +
> >> +   /**
> >> +    * Tracks the start of the set of optimistically-colored registers in the
> >> +    * stack.
> >> +    */
> >> +   unsigned int stack_optimistic_start;
> >>  };
> >>
> >> /**
> >> @@ -454,6 +460,7 @@ static void
> >>  ra_simplify(struct ra_graph *g)
> >>  {
> >>     bool progress = true;
> >> +   unsigned int stack_optimistic_start = ~0;
> >>     int i;
> >>
> >>     while (progress) {
> >> @@ -483,12 +490,16 @@ ra_simplify(struct ra_graph *g)
> >>
> >>        if (!progress && best_optimistic_node != ~0U) {
> >>           decrement_q(g, best_optimistic_node);
> >> +         stack_optimistic_start =
> >> +            MIN2(stack_optimistic_start, g->stack_count);
> >>           g->stack[g->stack_count] = best_optimistic_node;
> >>           g->stack_count++;
> >>           g->nodes[best_optimistic_node].in_stack = true;
> >>           progress = true;
> >>        }
> >>     }
> >> +
> >> +   g->stack_optimistic_start = stack_optimistic_start;
> >>  }
> >>
> >> /**
> >> @@ -542,7 +553,17 @@ ra_select(struct ra_graph *g)
> >>        g->nodes[n].reg = r;
> >>        g->stack_count--;
> >>
> >> -      if (g->regs->round_robin)
> >> +      /* Rotate the starting point except for any nodes above the lowest
> >> +       * optimistically colorable node. The likelihood that we will succeed
> >> +       * at allocating optimistically colorable nodes is highly dependent on
> >> +       * the way that the previous nodes popped off the stack are laid out.
> >> +       * The round-robin strategy increases the fragmentation of the register
> >> +       * file and decreases the number of nearby nodes assigned to the same
> >> +       * color, what increases the likelihood of spilling with respect to the
> >> +       * dense packing strategy.
> >> +       */
> >> +      if (g->regs->round_robin &&
> >> +          g->stack_count <= g->stack_optimistic_start + 1)
> >>           start_search_reg = r + 1;
> >>     }
> >>
> >> --
> >> 2.1.3
> >>
> >> _______________________________________________
> >> mesa-dev mailing list
> >> mesa-dev at lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
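
For anyone reading this thread later, the rule the quoted patch adds to
ra_select() can be sketched in isolation. This is only a rough
illustration, not the Mesa code: NUM_REGS, NO_OPTIMISTIC,
note_optimistic_push(), should_rotate() and simulate_select() are
hypothetical names made up for the example, and the nodes are assumed
never to interfere, so the first register tried is always free. The
sketch also tests the "no optimistic node" sentinel explicitly, because
adding 1 to ~0u wraps around to 0 in unsigned arithmetic.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS 8            /* hypothetical register file size */
#define NO_OPTIMISTIC (~0u)   /* sentinel: nothing was pushed optimistically */

/* Simplify side: remember the lowest stack index at which an optimistic
 * push happened (the quoted patch uses MIN2 for this). */
static unsigned
note_optimistic_push(unsigned optimistic_start, unsigned stack_count)
{
   return stack_count < optimistic_start ? stack_count : optimistic_start;
}

/* Select side: should the search start rotate after the node just popped?
 * stack_count is the count after the pop.  The sentinel is checked
 * explicitly since ~0u + 1 wraps to 0 in unsigned arithmetic. */
static bool
should_rotate(bool round_robin, unsigned stack_count,
              unsigned optimistic_start)
{
   if (!round_robin)
      return false;
   return optimistic_start == NO_OPTIMISTIC ||
          stack_count <= optimistic_start + 1;
}

/* Pop num_nodes nodes from the top of the stack and record the register
 * each one receives in regs[], indexed by stack position.
 * Simplification: no interference, so the first candidate always fits. */
static void
simulate_select(unsigned num_nodes, unsigned optimistic_start,
                bool round_robin, unsigned *regs)
{
   unsigned start_search_reg = 0;
   unsigned stack_count = num_nodes;

   assert(regs);

   while (stack_count != 0) {
      unsigned r = start_search_reg % NUM_REGS;

      regs[stack_count - 1] = r;
      stack_count--;

      if (should_rotate(round_robin, stack_count, optimistic_start))
         start_search_reg = r + 1;
   }
}
```

With six nodes and the first optimistic push at stack index 2, the three
nodes popped first (indices 5, 4, 3) are all packed into register 0, and
rotation only resumes for the nodes below them, which get registers 1, 2
and 3. With no optimistic push at all, every pop rotates and the six
nodes spread over registers 0 through 5.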