[Mesa-dev] [PATCH] glsl: Make copy propagation not panic when it sees an intrinsic.

Timothy Arceri timothy.arceri at collabora.com
Mon Dec 12 05:28:13 UTC 2016


On Mon, 2016-12-12 at 11:35 +1100, Timothy Arceri wrote:
> On Sun, 2016-12-11 at 00:00 -0800, Kenneth Graunke wrote:
> > 
> > On Saturday, December 10, 2016 12:37:16 PM PST Matt Turner wrote:
> > > 
> > > 
> > > On Fri, Dec 9, 2016 at 8:28 PM, Kenneth Graunke <kenneth at whitecap
> > > e.
> > > org> wrote:
> > > > 
> > > > 
> > > > A number of games have large arrays of constants, which we
> > > > promote to
> > > > uniforms.  This introduces copies from the uniform array to the
> > > > original
> > > > temporary array.  Normally, copy propagation eliminates those
> > > > copies,
> > > > making everything refer to the uniform array directly.
> > > > 
> > > > A number of shaders in "Deus Ex: Mankind Divided" recently
> > > > exposed a
> > > > limitation of copy propagation - if we had any intrinsics (i.e.
> > > > image
> > > > access in a compute shader), we weren't able to get rid of
> > > > these
> > > > copies.
> > > > 
> > > > That meant that any variable indexing remained on the temporary
> > > > array
> > > > rather being moved to the uniform array.  i965's scalar backend
> > > > currently doesn't support indirect addressing of temporary
> > > > arrays,
> > > > which meant lowering it to if-ladders.  This was horrible.
> > > > 
> > > > On Skylake:
> > > > 
> > > > total instructions in shared programs: 13700090 -> 13654519 (-
> > > > 0.33%)
> > > > instructions in affected programs: 56438 -> 10867 (-80.75%)
> > > 
> > > Wow!
> > > 
> > > > 
> > > > 
> > > > helped: 14
> > > > HURT: 0
> > > > 
> > > > total cycles in shared programs: 288879704 -> 291270232 (0.83%)
> > > > cycles in affected programs: 12758080 -> 15148608 (18.74%)
> > > 
> > > ... that seems nuts?
> > > 
> > > Any idea what's going on with the cycle counts?
> > 
> > Good point...I glossed over the cycle counts when I saw the -80%
> > reduction in instructions with 0 shaders hurt.  But they do look
> > pretty bad, so let's take a closer look...
> > 
> > There are two nearly identical shaders that are the worst
> > offenders:
> > 
> > shaders/closed/steam/deus-ex-mankind-divided/256.shader_test CS
> > SIMD16:
> > 
> >     instructions: 2770 -> 253 (-2,517 instructions or -90.87%)
> >     spills:         25 -> 0
> >     fills:          29 -> 0
> >     cycles:     923266 -> 1420534 (+497,268 cycles or +53.86%)
> >     compile time: 2.73 seconds -> 0.17 seconds
> > 
> > There are three loops in the program, each of which contains two
> > indirect reads of the uvec4[98] constant array.
> > 
> > Before this patch, there were:
> >  - 67 UNIFORM_PULL_CONSTANT_LOADs at the top of the program
> >  - 1 UNIFORM_PULL_CONSTANT_LOAD in the first (cheap) loop
> >  - 1 UNIFORM_PULL_CONSTANT_LOAD in the second (expensive) loop
> >  - 1 UNIFORM_PULL_CONSTANT_LOAD in the third (very expensive) loop
> > 
> > After this patch, there are:
> >  - 0 loads at the top of the program
> >  - 1 VARYING_PULL_CONSTANT_LOAD in the first (cheap) loop
> >  - 2 VARYING_PULL_CONSTANT_LOAD in the second (expensive) loop
> >  - 2 VARYING_PULL_CONSTANT_LOAD in the third (very expensive) loop
> > 
> > The array indexes in the expensive loop are a[foo] and a[foo + 1].
> > foo is modified in the loop, so they can't be hoisted out.  I don't
> > think we can determine the number of loop iterations.
> 
> The first loop has a trip count of 49. If I force it to unroll I get:
> 
> CS SIMD16 shader: 682 inst, 2 loops, 1009856 cycles.
> 
> The current unrolling rules are:
> 
> static bool
> is_loop_small_enough_to_unroll(nir_shader *shader, nir_loop_info *li)
> {
>    unsigned max_iter = shader->options->max_unroll_iterations;
> 
>    if (li->trip_count > max_iter)
>       return false;
> 
>    if (li->force_unroll) // this is for consecutive indirect array
> access
>       return true;
> 
>    bool loop_not_too_large =
>       li->num_instructions * li->trip_count <= max_iter * 25;
> 
>    return loop_not_too_large;
> }
> 
> 
> Maybe we should just drop the first trip count check? Checking the
> number of intructions * trip count makes a lot more sense. Dropping
> the
> check allows the first loop to unroll.

I tried to get this working for GLSL IR loop unrolling and fixed a
couple of the other problems, panicing at intrinsics (partialy fixed
-well I just ignore them for now), and nequal condition. 

But I also ran into futher issues:

1) Not being able to detect an induction variable because we end up
with: 

(assign  (x) (var_ref compiler_temp at 19)  (expression uint bitcast_f2u
(swiz y (var_ref r1) )) ) 
                (if (expression bool != (var_ref compiler_temp at 19)
(constant uint (0)) ) (
                  break
                )

2) The GLSL IR analysis pass bails if the first if in the loop is not a
loop terminator.

It would have been nice to fix the GLSL IR pass so I could compare the
NIR pass to the improved version and try to avoid regressing shaders
like this but I'm stating to think it's not worth all the effort. What
do you think?

[1] https://github.com/tarceri/Mesa/compare/unroll_more

Anyway this patch looks correct to me so:

Reviewed-by: Timothy Arceri <timothy.arceri at collabora.com>

> 
> > 
> > 
> > The two expensive loops look to be twice as expensive after this
> > patch.
> > The numbers aren't quite adding up for me - it looks like we should
> > spend 200 more cycles per loop iteration, but the loops are like
> > 40,000
> > -> 90,000 cycles.
> > 
> > I'm not sure what to do with this information.  Eliminating 90% of
> > the
> > instructions seems good.  Requiring no scratch access seems good.
> > Eliminating the 67 memory loads outside of the loops seems good.
> > Doing two memory loads per loop doesn't seem too crazy, given that
> > it matches the GLSL source code.  Burning 49 registers to store the
> > entire array for the lifetime of the program seems pretty crazy...
> > 
> > --Ken
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list