[Mesa-dev] i965 SamplerUnits rework

Ian Romanick idr at freedesktop.org
Fri Aug 24 10:21:51 PDT 2012


On 08/24/2012 03:05 AM, Kenneth Graunke wrote:
> Greetings!
>
> This series reworks how i965 deals with sampler indirections, changing it
> to use linker-assigned sampler variable IDs in SEND instructions rather
> than baking in the ID of the texture unit they happen to be bound to.
>
> Instead, it now encodes that mapping in the binding table, sampler state
> table, and sampler default color table, which are updated and re-emitted
> at least once per batch anyway.
>
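
The indirection described above can be sketched roughly as follows. This
is only an illustration, not code from the series: struct program,
struct unit_state and build_binding_table() are hypothetical names, and
the real driver tables hold hardware state rather than plain integers.

```c
#define MAX_SAMPLERS 16

/* Hypothetical stand-in for a linked program: sampler_units[s] is the
 * texture unit that sampler variable s currently points at (the value
 * the application sets with glUniform1i). */
struct program {
    unsigned num_samplers;
    unsigned char sampler_units[MAX_SAMPLERS];
};

/* Hypothetical per-texture-unit state; surface_id stands in for
 * whatever the hardware table entry would reference. */
struct unit_state {
    int surface_id;
};

/* Fill a compact table indexed by sampler variable ID, with one entry
 * per sampler the program uses.  The shader only ever refers to the
 * sampler ID, so rebinding a sampler to another texture unit changes
 * this table, not the compiled code.  Returns the number of entries. */
static unsigned
build_binding_table(const struct program *prog,
                    const struct unit_state *units,
                    int *table)
{
    for (unsigned s = 0; s < prog->num_samplers; s++)
        table[s] = units[prog->sampler_units[s]].surface_id;
    return prog->num_samplers;
}
```

With this shape, a glUniform1i call that moves sampler 0 from unit 5 to
unit 1 only changes table[0] the next time the table is rebuilt.
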
> This has several advantages:
> - We no longer need to recompile fragment shaders every time an application
>    calls glUniform1i to set its sampler uniforms' values.
>
> - The game "Cogs" (from Humble Bundle 3) drops from 99% CPU usage (as it
>    continually recompiles fragment shaders due to ping-ponging between
>    texture units 0 and 1) to a mere 30%.  (It's still slow, but that's an
>    unrelated issue.)
>
>    This also fixes that issue for Gallium drivers, which may make it
>    playable on Radeon and Nouveau.
>
> - Our sampler state and sampler default color tables are now compact,
>    only containing as many entries as necessary, rather than covering all
>    texture units (sparsely, whether used or not).
>
> - Without this change, fragment shader precompiles are basically useless:
>    we compiled assembly at glLinkProgram() time, before the application had
>    a chance to call glUniform1i() to bind sampler variables to texture units.
>    When it does, we would get a ProgramStringNotify, bumping the ID in the
>    program key and making our nice precompiled shader irrelevant for eternity.
>
> - I think it'll be helpful for Haswell.
>
> The only downside I see is that the tables now depend on the active
> program, which means we may need to re-emit them more often.  The cost
> of emitting state is much, much lower than recompiling, and we already
> re-emit these at least once per batch anyway, so it shouldn't be too bad.
>
> There are a few quirky aspects to think about:
> - This doesn't eliminate all the texture-related NOS (non-orthogonal state):
>    DEPTH_TEXTURE_MODE and EXT_texture_swizzle are properties of the currently
>    bound textures, and require extra instructions to do the swizzling.
>
>    We still listen to _NEW_TEXTURE and recompute the program keys.  However,
>    unless the app has actually /changed/ the swizzling, it should be a cache
>    hit, and we won't have to recompile.
>
> - No changes are necessary in the old brw_vs_emit/brw_wm_emit backends:
>    ARB_fragment_program uses texture unit numbers directly, so SamplerUnits
>    is actually the identity mapping.  Fixed function fragment shading uses
>    the new brw_fs backend.  Pre-GLSL, vertex processing couldn't use textures.
>
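
The cache-hit behaviour described for _NEW_TEXTURE can be pictured with a
minimal sketch like the one below. It assumes a program key that records
only a program ID and a per-sampler swizzle; struct prog_key,
struct cache and lookup_or_compile() are hypothetical names, not the
driver's actual key layout or cache API.

```c
#include <string.h>

#define KEY_MAX_SAMPLERS 16
#define CACHE_SIZE 8

/* Hypothetical program key: only state that changes the generated code
 * belongs here.  The texture unit a sampler is bound to is absent. */
struct prog_key {
    unsigned program_id;
    unsigned char swizzle[KEY_MAX_SAMPLERS];
};

/* Hypothetical fixed-size cache of compiled variants. */
struct cache {
    struct prog_key keys[CACHE_SIZE];
    unsigned count;
    unsigned compiles;   /* how many times we had to "compile" */
};

/* Return the slot of the variant matching key, compiling (here: just
 * recording) a new one on a miss.  Returns -1 if the cache is full.
 * Keys are compared bytewise, so callers should zero them first. */
static int
lookup_or_compile(struct cache *c, const struct prog_key *key)
{
    for (unsigned i = 0; i < c->count; i++)
        if (memcmp(&c->keys[i], key, sizeof(*key)) == 0)
            return (int)i;
    if (c->count == CACHE_SIZE)
        return -1;
    c->keys[c->count] = *key;
    c->compiles++;
    return (int)c->count++;
}
```

Recomputing the key after a texture change and looking it up again costs
one comparison per cached variant; a new compile happens only when a
field in the key, such as the swizzle, actually differs.
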
> Those who talked to me earlier might be surprised to see that this doesn't
> have separate VS/FS sampler state tables.  I realized that since the
> indices are actually part of the linked program, we basically already have
> a single combined table anyway; splitting them actually just wasted space
> (and required a lot of unnecessary code churn).
>
> No Piglit regressions on Sandybridge.  Untested on Ivybridge.
> And no, I haven't run oglconform.
>
> Please review.  It would be great to get this into 9.0.

Patches 1 through 6 are

Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>

I probably won't have a chance to get through the rest, so don't wait on me.

