Mesa (master): 29 new commits

Kenneth Graunke kwg at kemper.freedesktop.org
Mon Aug 19 20:17:25 UTC 2013


URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e197f5373037a972244e15b8453007dd165b9b35
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Aug 14 20:18:24 2013 -0700

    i965: Make the VS binding table as small as possible.
    
    For some reason, we didn't use this information even though the VS
    backend has computed it (albeit poorly) for ages.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7e9559c9ba4dd82aca83b08d039103e38a3f94be
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Aug 14 20:25:40 2013 -0700

    i965/vs: Rework binding table size calculation.
    
    Unlike the FS, the VS backend already computed the binding table size.
    However, it did so poorly: after compilation, it looked to see if any
    pull constants/textures/UBOs were in use, and set num_surfaces to the
    maximum surface index for that category.  If the VS only used a single
    texture or UBO, this overcounted by quite a bit.
    
    The shader time surface was also noted at state upload time (during
    drawing), not at compile time, which is inefficient.  I believe it also
    had an off-by-one error.
    
    This patch computes it accurately, while also simplifying the code.
    
    It also renames num_surfaces to binding_table_size, since num_surfaces
    wasn't actually the number of surfaces used.  For example, a VS that
    used one UBO and no other surfaces would have set num_surfaces to
    SURF_INDEX_VS_UBO(1) == 18, rather than 1.  A bit of a misnomer there.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=c642bd3dcc1a6f1039732c614ab8a56dd3779427
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Aug 14 20:42:29 2013 -0700

    i965/vs: Plumb brw_vec4_prog_data into vec4_generator().
    
    This will be useful for the next commit.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=60689c05d1f69610b3daac1c9f407c75ebecc81b
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Aug 14 19:54:25 2013 -0700

    i965/fs: Make the FS binding table as small as possible.
    
    Computing the minimum size is easy, and it is done at compile time, so
    it adds no extra overhead here.  Making the binding table smaller wastes
    less batch space.
    
    Adding the CACHE_NEW_WM_PROG dirty bit isn't strictly necessary, since
    other atoms depend on it and flag BRW_NEW_SURFACES.  However, it's best
    to add it for clarity and safety.  It shouldn't add any new overhead.
    
    v2: Use binding_table_size, rather than max_surface_index.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6d89bc803d83d27d8946afdd2f749334a41a9d1f
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Aug 14 19:49:33 2013 -0700

    i965/fs: Track the binding table size in brw_wm_prog_data.
    
    By tracking the maximum surface index used by the shader, we know just
    how small we can make the binding table.
    
    Since it depends entirely on the shader program, we can just compute
    it once at compile time, rather than at binding table emit time (which
    happens during drawing).
    
    v2: Store binding_table_size, rather than max_surface_index, for
        consistency with the VS (which needs to be able to represent 0
        surfaces).
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
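
A minimal sketch of the size-versus-maximum-index distinction described in the v2 note above. This is not the driver code: the function name and the list-of-indices input are hypothetical, and the sketch assumes the table must cover every index from 0 up to the highest one in use.

```python
def binding_table_size(used_surface_indices):
    """Return the smallest table size covering every used surface index.

    Hypothetical helper: the real driver tracks this while compiling the
    shader; here the used indices are simply passed in as a list.
    """
    if not used_surface_indices:
        # A shader that uses no surfaces needs an empty table.  A plain
        # "maximum index" field could not tell this apart from a shader
        # that uses only index 0.
        return 0
    # Entries are addressed by index, so the table must extend one past
    # the highest index in use.
    return max(used_surface_indices) + 1
```

Storing a size rather than a maximum index is what lets the value 0 mean "no surfaces at all", which is the case the v2 note says the VS needs to represent.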

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7c717690b5594c768a18cc2a00364e5ec7bc20ab
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Aug 14 19:43:09 2013 -0700

    i965: Use SURF_INDEX_DRAW() for drawbuffer binding table indices.
    
    SURF_INDEX_DRAW() has been the identity function since the dawn of time,
    and both the shader code and binding table upload code relied on that,
    simply using X rather than SURF_INDEX_DRAW(X).
    
    Even if that continues to be true, using the macro clarifies the code.
    
    The comment about draw buffers needing to be first in order for
    headerless render target writes to work turned out to be wrong; with
    this change, SURF_INDEX_DRAW can be changed to arbitrary indices and
    everything continues working.
    
    The confusion was over the "Render Target Index" field in the FB write
    message header.  If it were a binding table index, then RT 0 would have
    to be at index 0 for headerless FB writes to work.  However, it's
    actually an index into the blend state table, so there's no problem.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Cc: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=c5fe7d063cc886ef1307f8ea58a301debed12fba
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Jul 9 15:17:15 2013 -0700

    i965: Shorten sampler loops in key setup.
    
    Now that we have the number of samplers available, we don't need to
    iterate over all 16.  This should be particularly helpful for vertex
    shaders.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d0401d09ce19e47e01a3f1c86c10894515de26ad
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Jul 9 15:09:05 2013 -0700

    i965: Make sampler counts available for the entire drawing operation.
    
    Previously, we computed sampler counts when generating the SAMPLER_STATE
    table.  By computing it earlier, we should be able to shorten a bunch of
    loops.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=c6e572275b60f0221691b9b97650b9b41b89a5a2
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Wed Jun 26 15:42:43 2013 -0700

    i965: Split the brw_samplers atom into separate FS/VS stages.
    
    This allows us to avoid uploading the VS sampler state table if only the
    fragment program changes.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7e01af662ad12bd2b27034f3ca7687e2986b5dbd
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Jun 25 22:14:04 2013 -0700

    i965: Upload separate VS and FS sampler state tables.
    
    Now, each shader stage has a sampler state table that only refers to the
    samplers actually used by that program.  This should make the VS table
    nonexistent or very small.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=2b7f876a6ad62ad9a93c0df15cb4be1fcc61d380
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Jun 25 22:29:19 2013 -0700

    i965: Make upload_sampler_state_table a virtual function.
    
    This allows us to coalesce the brw_samplers and gen7_samplers atoms.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=decc708c7c3ab53922cf3ac94cd74231196fd0cb
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Aug 21 23:54:19 2012 -0700

    i965: Upload separate per-stage sampler state tables.
    
    Also upload separate sampler default/texture border color entries.
    
    At the moment, this is completely idiotic: both tables contain exactly
    the same contents, so we're simply wasting batch space and CPU time.
    
    However, soon we'll only upload data for textures actually /used/ in
    a particular stage, which will usually make the VS table empty and
    very likely eliminate all redundancy.  This is just a stepping stone.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=9525bcf5f799ffdf6db4cfa41da0daee142e6d3a
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Aug 21 23:40:02 2012 -0700

    i965: Un-hardcode border color table from update_sampler_state().
    
    Like the previous patch, this simply pushes direct access to brw->wm up
    one level in the call chain.  Rather than passing the whole array, we
    just pass a pointer to the correct spot in the array, similar to what we
    do for the actual sampler state structure.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ed4459b10bda151de0d147936c848939c5da045a
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Aug 21 16:13:17 2012 -0700

    i965: Un-hardcode border color table from upload_default_color.
    
    When we begin uploading separate sampler state tables for VS and FS,
    we won't be able to use &brw->wm.sdc_offset[ss_index].  By passing it in
    as a parameter, we push the problem up to the caller.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=f5a690cb68d69c0279ab95ecb2d188ede13ada03
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Tue Aug 21 15:24:14 2012 -0700

    i965: Split sampler count variable to be per-stage.
    
    Currently, we only have a single sampler state table shared among all
    stages, so we just copy wm.sampler_count into vs.sampler_count.
    
    In the future, each shader stage will have its own SAMPLER_STATE table,
    at which point we'll need these separate sampler counts.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=44960ef918fff24cf7e49f4c89e845709aae3541
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:54:24 2013 -0700

    i965/fs: Re-enable global copy propagation.
    
    I believe the data flow analysis actually works now, and it should be
    safe to re-enable global copy propagation.  It even does things now.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=72f2249c115a6bfafc809ebb4cb78c860279e41f
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:44:25 2013 -0700

    i965/fs: Fix computation of livein.
    
    Since the initial value for livein is an overestimation (0xffffffff),
    it's extremely likely that it will shrink, which means we can't simply
    OR in new bits - we need to fully recompute it based on the current
    liveout values.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
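
A small sketch of why OR-ing new bits into an overestimated livein cannot work. It assumes the usual rule for this kind of analysis, namely that a block's livein is the intersection of the liveout sets of its parent blocks; the function name and the 32-bit set width are illustrative assumptions, not taken from the driver.

```python
UNIVERSE = 0xFFFFFFFF  # initial overestimate: every copy assumed available


def recompute_livein(parent_liveouts):
    """Intersect the liveout bitsets of all parent blocks (assumed rule)."""
    result = UNIVERSE
    for out in parent_liveouts:
        result &= out
    return result


# The stale value is the overestimate; the parents now report smaller sets.
old_livein = UNIVERSE
fresh = recompute_livein([0b0110, 0b0011])

# OR-ing the fresh bits into the old value can only add bits, never remove any.
or_only = old_livein | fresh
```

Here `fresh` keeps only the bit common to both parents, while `or_only` is still the full overestimate, which is the failure mode the commit message describes.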

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=70b02a7facf88d5f17655be5e17f053d8531a278
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:39:07 2013 -0700

    i965/fs: Fully recompute liveout at each step.
    
    Since we start with an overestimation of livein (0xffffffff), successive
    steps can actually take away values.  This means we can't simply OR in
    new liveout values; we need to recompute it from scratch at each
    iteration of the fixed-point algorithm.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d20b472d0a6b016e4827d0986a10df29277a3a5e
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:47:19 2013 -0700

    i965/fs: Skip the initial block when updating livein/liveout.
    
    The starting block always has livein = 0 and liveout = COPY.  Since we
    start with real data, not estimates, there's no need to refine it with
    the fixed point algorithm.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=731145c5794c2831a833778b0940c999273ec984
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Sun Aug 11 22:07:17 2013 -0700

    i965/fs: Drop unnecessary and incorrect liveout initialization.
    
    The previous commit properly initialized liveout.  The older (and
    incorrect) initialization is no longer necessary.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1d40c784f22dcbe814e7915d1fae45774a264526
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:34:11 2013 -0700

    i965/fs: Properly initialize the livein/liveout sets.
    
    Previously, livein was initialized to 0 for all blocks.  According to
    the textbook, it should be the universal set (~0) for all blocks except
    the one representing the start of the program (which should be 0).
    
    liveout also needs to be initialized to COPY for the initial block.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=f06826cece7ad6348c93760e473e5a35ad872431
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:36:54 2013 -0700

    i965/fs: Use the COPY set in the calculation for liveout.
    
    According to page 360 of the textbook, the proper formula for liveout
    is:
    
    CPout(i) = COPY(i) union (CPin(i) - KILL(i))
    
    Previously, we omitted COPY.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
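
The formula above maps directly onto bitwise operations when each set is stored as a bitmask. The sketch below is illustrative only: the function names and the 32-bit width are assumptions, and set difference is written as AND with the complement.

```python
UNIVERSE = 0xFFFFFFFF  # assumed 32-bit set width


def cp_out(copy, cp_in, kill):
    """CPout = COPY union (CPin - KILL), with sets as bitmasks."""
    # Masking with UNIVERSE keeps the complement within 32 bits.
    return copy | (cp_in & ~kill & UNIVERSE)


def cp_out_without_copy(cp_in, kill):
    """The earlier, incomplete form that omitted COPY."""
    return cp_in & ~kill & UNIVERSE


# Copy 2 is created in this block, copies 0 and 1 arrive from above,
# and the block kills copy 0.
with_copy = cp_out(copy=0b0100, cp_in=0b0011, kill=0b0001)
without_copy = cp_out_without_copy(cp_in=0b0011, kill=0b0001)
```

With COPY included, the copy created inside the block (bit 2) reaches the block's successors; without it, only the surviving incoming copy (bit 1) does.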

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a291c59bbae7d9d96487a984f81a298a1fd71389
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:27:22 2013 -0700

    i965/fs: Simplify liveout calculation.
    
    Excluding the existing liveout bits is a deviation from the textbook
    algorithm.  The reason for doing so was to determine if the value
    changed, which means the fixed-point algorithm needs to run for another
    iteration.
    
    The simpler way to do that is to save the value from step (N-1) and
    compare it to the new value at step N.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=597efd2b67d1afb8a95be38145c4f977ed36b672
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Thu Aug 8 23:29:56 2013 -0700

    i965/fs: Create the COPY() set for use in copy propagation dataflow.
    
    This is the "COPY" set from Muchnick's textbook, which is necessary
    to do the dataflow algorithm correctly.
    
    v2: Simplify initialization based on Paul Berry's observation that
        out_acp contains exactly what needs to be in the COPY set.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=669d4d7f77648948800abce59bc99a29a338a3ad
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:07:45 2013 -0700

    i965/fs: Rename setup_kills() to setup_initial_values().
    
    Although this function currently only initializes the KILL set, it will
    soon initialize other data flow sets as well.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=2ef81372dccc102d95b3dcec22b42406e1b55af9
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 17:53:05 2013 -0700

    i965/fs: Separate the updating of liveout/livein.
    
    To compute the actual liveout/livein data flow values, we start with
    some initial values and apply a fixed-point algorithm until they settle.
    
    Previously, we iterated through all blocks, updating both liveout and
    livein together in one pass.  This is awkward, since computing livein
    for a block requires knowing liveout for all parent blocks.  Not all
    of those parent blocks may have been processed yet.
    
    This patch separates the two.  First, we update liveout for all blocks.
    At iteration N of the fixed-point algorithm, this uses livein values
    from iteration N-1.  Secondly, we update livein for all blocks.  At
    step N, this uses the liveout information we just computed (in step N).
    
    This ensures each computation has a consistent picture of the data,
    rather than seeing a random mix of data from steps N-1 and N depending
    on the order of the blocks in the CFG data structure.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>
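
The two-phase update can be sketched as follows. This is a toy model, not the i965 code: the block representation (dicts with `parents`, `copy` and `kill`), the function name, and the 32-bit set width are all assumptions. It also assumes the entry block has no parents, that every other block has at least one, and that livein is the intersection of the parents' liveout sets.

```python
UNIVERSE = 0xFFFFFFFF  # assumed 32-bit set width


def solve_copy_dataflow(blocks, entry=0):
    """Run the fixed-point iteration; returns (livein, liveout) lists."""
    n = len(blocks)
    # Entry block starts with real data; all others start overestimated.
    livein = [UNIVERSE] * n
    liveout = [0] * n
    livein[entry] = 0
    liveout[entry] = blocks[entry]["copy"]

    while True:  # emulates do-while: the body runs at least once
        progress = False

        # Phase 1: liveout for every block, using livein from step N-1.
        for i, block in enumerate(blocks):
            if i == entry:
                continue
            new_out = block["copy"] | (livein[i] & ~block["kill"] & UNIVERSE)
            if new_out != liveout[i]:
                liveout[i] = new_out
                progress = True

        # Phase 2: livein for every block, using liveout from step N.
        for i, block in enumerate(blocks):
            if i == entry:
                continue
            new_in = UNIVERSE
            for parent in block["parents"]:
                new_in &= liveout[parent]
            if new_in != livein[i]:
                livein[i] = new_in
                progress = True

        if not progress:
            return livein, liveout


# Diamond-shaped CFG: 0 -> {1, 2} -> 3.
diamond = [
    {"parents": [], "copy": 0b0001, "kill": 0},
    {"parents": [0], "copy": 0b0010, "kill": 0b0001},
    {"parents": [0], "copy": 0b0100, "kill": 0},
    {"parents": [1, 2], "copy": 0, "kill": 0},
]
livein, liveout = solve_copy_dataflow(diamond)
```

Because each phase reads only values that are complete for its step, the result does not depend on the order in which blocks appear in the list. In the diamond example, block 3 ends with an empty livein: block 1 kills copy 0, and the copies created in blocks 1 and 2 are each available on only one incoming path.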

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7d86042dee17dfd985dcab098fc97838c11a5662
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:25:36 2013 -0700

    i965/fs: Rename "cont" to "progress" in dataflow algorithm.
    
    This variable indicates that the fixed-point algorithm made changes to
    the data at this step, so it needs to run for another iteration.
    
    "progress" seems a nicer name for that.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0225dea6c49674a27d5be6e933447d8a4ba5a82e
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 17:50:03 2013 -0700

    i965/fs: Switch to a do-while loop in copy propagation dataflow.
    
    The fixed-point algorithm needs to run at least once, so a do-while loop
    is more natural.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=3c68662bb1d41727b6c53fd58868cdcfe6a98492
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Fri Aug 9 18:51:05 2013 -0700

    i965/fs: Skip global copy propagation step.
    
    The dataflow analysis used for global copy propagation is severely
    broken, and I believe it doesn't actually do anything.  Fixing it will
    require a lot of changes, each of which might break things.
    
    Once all the fixes land, we can re-enable this.
    
    Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
    Reviewed-by: Paul Berry <stereotype441 at gmail.com>



