[Bug 77957] Variably-indexed constant arrays result in terrible shader code

bugzilla-daemon at freedesktop.org
Thu Nov 6 23:58:30 PST 2014


https://bugs.freedesktop.org/show_bug.cgi?id=77957

Kenneth Graunke <kenneth at whitecape.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
           Assignee|jason at jlekstrand.net        |kenneth at whitecape.org

--- Comment #5 from Kenneth Graunke <kenneth at whitecape.org> ---
Fixed - with my new lowering pass, it no longer generates terrible code.

commit 4f22db5fbbe59eacb762aa410f18c3078e85c2b7
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Sat Apr 26 00:18:54 2014 -0700

    glsl: Lower constant arrays to uniform arrays.

    Consider GLSL code such as:

       const ivec2 offsets[] =
          ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
                  ivec2(0, -1),  ivec2(0, 0),  ivec2(0, 1),
                  ivec2(1, -1),  ivec2(1, 0),  ivec2(1, 1));

       ivec2 offset = offsets[<non-constant expression>];

    Both i965 and nv50 currently handle this very poorly.  On i965, this
    becomes a pile of MOVs to load the immediate constants into registers,
    a pile of scratch writes to move the whole array to memory, and one
    scratch read to actually access the value - effectively the same as if
    it were a non-constant array.

    We'd much rather upload large blocks of constant data as uniform data,
    so drivers can simply upload the data via constbufs, and not have to
    populate it via shader instructions.

    This is currently non-optional because both i965 and nouveau benefit
    from it, and according to Marek radeonsi would benefit today as well.
    (According to Tom, radeonsi may want to handle this itself in the long
    term, but we can always add a flag when it becomes useful.)

    Improves performance in a terrain rendering microbenchmark by about 2x,
    and cuts the number of instructions roughly in half.  Helps a lot of
    "Natural Selection 2" shaders, as well as one "HOARD" shader.

    total instructions in shared programs: 5473459 -> 5471765 (-0.03%)
    instructions in affected programs:     5880 -> 4186 (-28.81%)

    v2: Use ir_var_hidden to avoid exposing the new uniform via the GL
        uniform introspection API.

    v3: Alphabetize Makefile.sources properly.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
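
For readers who want to picture the result, here is a rough source-level sketch of what the lowering amounts to. It is only an illustration: the actual pass operates on the compiler's IR, not on GLSL text, and the names `lowered_offsets`, `index`, and `color` are made up for this example. The real uniform is marked hidden (ir_var_hidden), so an application would never see or set it.

```glsl
#version 130

// Hypothetical stand-in for the hidden uniform created by the lowering pass.
// It carries the same nine values as the original const array, but as uniform
// data that a driver can upload once instead of rebuilding it in the shader.
uniform ivec2 lowered_offsets[9] =
   ivec2[9](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
            ivec2(0, -1),  ivec2(0, 0),  ivec2(0, 1),
            ivec2(1, -1),  ivec2(1, 0),  ivec2(1, 1));

// Hypothetical stand-in for the <non-constant expression> in the commit
// message; any value in 0..8 that is not known at compile time would do.
uniform int index;

out vec4 color;

void main()
{
   // The variable index now reads from uniform storage, so no MOVs or
   // scratch writes are needed to materialize the array first.
   ivec2 offset = lowered_offsets[index];

   // Use the value so the lookup is not optimized away.
   color = vec4(vec2(offset), 0.0, 1.0);
}
```

Written by hand, this differs from the original snippet only in the storage qualifier, which is why the pass can apply it automatically without changing what the shader computes.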
