[Mesa-dev] [RFC PATCH 00/10] glsl/loops: Eliminate loop control fields.

Paul Berry stereotype441 at gmail.com
Sat Nov 30 08:38:09 PST 2013


Currently, when loop analysis detects a bounded loop (one which is
known to no more than N times), it converts it into a "Fortran-style"
loop, in which explicit fields in ir_loop record the loop counter, the
from, to, and increment values, and the type of comparison used to
terminate the loop, rather than using an representing them implicitly
using assignments and a conditional break.  These explicit fields in
ir_loop are collectively referred to as the "loop control" fields.

Fortran-style loops have the advantage that they're easy to unroll
(since the loop termination condition is nicely separated from the
body of the loop), however they have the two disadvantages:

(1) Many optimization passes (for example constant folding and cse)
    don't correctly analyze Fortran-style loops.  If a loop control
    variable were ever referenced inside the body of the loop, or
    after the loop ended, those optimization passes would misbehave.
    (Note: before commit 9d2951e, even dead code elimination didn't
    analyze Fortran-style loops properly, and it was only other bugs
    that prevented loop counter variables from being erroneously
    eliminated).

(2) To work around (1), loop analysis always synthesizes a new
    variable to store in ir_loop::counter.  As a result, many loops
    wind up getting compiled with redundant loop counters, which costs
    instructions and adds to register pressure.

For an example of (2), consider how the following loop gets compiled:

    for (int i = 0; i < 100; i++) {
      total += i;
    }

After some preliminary optimization, it looks like this (eliding dead
variables):

    (assign (x) (var_ref i) (constant int (0)))
    (loop () () () () (
      (if (expression bool >= (var_ref i) (constant int (100)))
        (break)
        ())
      (assign (x) (var_ref total)
                  (expression int + (var_ref total) (var_ref i)))
      (assign (x) (var_ref i)
                  (expression int + (var_ref i) (constant int (1))))))

The loop_controls pass transforms it into this:

    (assign (x) (var_ref i) (constant int (0)))
    (loop ((declare () int i at 9)) ((constant int (0)))
          ((constant int (100))) ((constant int (1))) (
      (assign (x) (var_ref total)
                  (expression int + (var_ref total) (var_ref i)))
      (assign (x) (var_ref i)
                  (expression int + (var_ref i) (constant int (1))))))

Note that there are now two redundant counter variables: i and i at 9.
We don't have any optimization pass which is capable of detecting the
redundancy, so as a result the back-end has to use an extra register,
and generate redundant instructions.


This patch solves the problem by eliminating the loop control fields
from ir_loop (effectively preventing us from ever representing a loop
in Fortran-style).  The loop_controls pass still exists, but it
doesn't create Fortran-style loops anymore; instead it merely
eliminates zero-iteration loops and removes redundant loop
terminators.  Loop unrolling no longer requires loops to be in
Fortran-style; it is capable of unrolling loops directly from their
non-Fortran form.

Patch 1 consolidates the back-end code to deal with Fortran-style
loops into a lowering pass, so that this code isn't duplicated in
every back-end anymore.  Patch 2 simplifies the loop controls to a
single integer bound rather than separate
counter/from/to/increment/cmp fields (so that instead of representing
bounded loops in Fortran-style, they are now simply loops with a
normative iteration bound).  Patches 3-5 do some clean-up work.  Patch
6 moves some analysis code from loop_controls to loop_analysis so that
the analysis results can be re-used by loop unrolling.  Patches 7-8 do
more clean-up.  Patch 9 is the crux of the series--it changes
loop_controls to stop generating Fortran-style loops, and modifies
loop unrolling to handle non-Fortran-style loops.  Finally, patch 10
removes the integer bound that was introduced in patch 2, and the
lowering pass that was introduced in patch 1, since they are no longer
used.

Note: on i965, patch 1 introduces two extra instructions to every
bounded loop, and patch 9 removes two instructions from every bounded
loop, so this series produces no change in shader-db.  However, I
think the series is still worth doing, since the extra instructions
introduced by patch 1 can be easily cleaned up by a future peephole
optimization, whereas the instructions eliminated by patch 9 would
have been difficult to eliminate if we kept using Fortran-style loops.

Note: I'm aware that when loop_analysis and loop_controls were first
written, there were future improvements planned which would have
rewritten redundant loop induction variables in terms of each other.
So, for example, this loop:

    for (i = 0, j = 9; i < 10; i++, j--)
      do_stuff(i, j);

might get rewritten to this:

    for (i = 0; i < 10; i++)
      do_stuff(i, 9 - i);

I don't believe that anything in this patch series closes the door to
those kinds of future improvements, since all the information
necessary to do them is still recorded by loop_analysis; it simply
lives in the loop analysis data structures now rather than in the IR
tree.

Note: This series must be applied atop the series beginning with
"[PATCH 1/4] glsl: Extract functions from
loop_analysis::visit(ir_dereference_variable *).", which is still
under review.  To see the series in context, check out branch
loop_control_rework from https://github.com/stereotype441/mesa.git.


[RFC PATCH 01/10] glsl/loops: consolidate bounded loop handling into a lowering pass.
[RFC PATCH 02/10] glsl/loops: replace loop controls with a normative bound.
[RFC PATCH 03/10] glsl/loops: Remove unused fields iv_scale and biv from loop_variable class.
[RFC PATCH 04/10] glsl/loops: Remove unnecessary list walk from loop_control_visitor.
[RFC PATCH 05/10] glsl/loops: Allocate loop_terminator using new(mem_ctx) syntax.
[RFC PATCH 06/10] glsl/loops: Move some analysis from loop_controls to loop_analysis.
[RFC PATCH 07/10] glsl/loops: Simplify loop unrolling logic by breaking into functions.
[RFC PATCH 08/10] glsl/loops: Get rid of loop_variable_state::max_iterations.
[RFC PATCH 09/10] glsl/loops: Stop creating normatively bound loops in loop_controls.
[RFC PATCH 10/10] glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound.


More information about the mesa-dev mailing list