[Mesa-dev] i965 SamplerUnits rework
Ian Romanick
idr at freedesktop.org
Fri Aug 24 10:21:51 PDT 2012
On 08/24/2012 03:05 AM, Kenneth Graunke wrote:
> Greetings!
>
> This series reworks how i965 deals with sampler indirections, changing it
> to use linker-assigned sampler variable IDs in SEND instructions rather
> than baking in the ID of the texture unit they happen to be bound to.
>
> Instead, it now encodes that mapping in the binding, sampler state,
> and sampler default color tables, which are updated and re-emitted at least
> once per batch anyway.
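
A minimal sketch of the indirection described above, not the actual driver code. The shader refers to sampler index s, and a compact per-program table is filled from whichever texture unit that sampler is currently bound to. All names here (struct program, struct unit_state, build_compact_table, surface_id) are hypothetical and chosen only for illustration.

```c
#include <assert.h>

#define MAX_TEXTURE_UNITS 16  /* assumed limit for this sketch */
#define MAX_SAMPLERS      16  /* assumed limit for this sketch */

/* Hypothetical stand-in for the state attached to one texture unit. */
struct unit_state {
    int surface_id;  /* placeholder for whatever a table entry would reference */
};

/* Hypothetical stand-in for a linked program. */
struct program {
    int num_samplers;
    int sampler_units[MAX_SAMPLERS];  /* sampler index -> texture unit */
};

/*
 * Fill a compact table with one entry per sampler the program uses.
 * Entry s describes the texture unit currently bound to sampler s, so
 * rebinding a sampler changes only the table, never the compiled code.
 * Returns the number of entries written.
 */
static int build_compact_table(const struct program *prog,
                               const struct unit_state units[MAX_TEXTURE_UNITS],
                               int table[MAX_SAMPLERS])
{
    assert(prog->num_samplers <= MAX_SAMPLERS);
    for (int s = 0; s < prog->num_samplers; s++) {
        int unit = prog->sampler_units[s];
        assert(unit >= 0 && unit < MAX_TEXTURE_UNITS);
        table[s] = units[unit].surface_id;
    }
    return prog->num_samplers;
}
```

In this sketch, a glUniform1i() call on a sampler uniform would amount to changing one sampler_units[] entry and rebuilding the table. The table holds num_samplers entries rather than one per texture unit.
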
>
> This has several advantages:
> - We no longer need to recompile fragment shaders every time an application
> calls glUniform1i() to set its sampler uniforms' values.
>
> - The game "Cogs" (from Humble Bundle 3) drops from 99% CPU usage (as it
> continually recompiles fragment shaders due to ping-ponging between
> texture units 0 and 1) to a mere 30%. (It's still slow, but that's an
> unrelated issue.)
>
> This also fixes that issue for Gallium drivers, which may make it
> playable on Radeon and Nouveau.
>
> - Our sampler state and sampler default color tables are now compact,
> only containing as many entries as necessary, rather than covering all
> texture units (sparsely, whether used or not).
>
> - Without this change, fragment shader precompiles are basically useless:
> we compile assembly at glLinkProgram() time, before the application has had
> a chance to call glUniform1i() to bind sampler variables to texture units.
> When it does, we get a ProgramStringNotify, bumping the ID in the
> program key and making our nice precompiled shader irrelevant for eternity.
>
> - I think it'll be helpful for Haswell.
>
> The only downside I see is that the tables now depend on the active
> program, which means we may need to re-emit them more often. The cost
> of emitting state is much, much lower than recompiling, and we already
> re-emit these at least once per batch anyway, so it shouldn't be too bad.
>
> There are a few quirky aspects to think about:
> - This doesn't eliminate all the texture-related NOS (non-orthogonal state):
> DEPTH_TEXTURE_MODE and EXT_texture_swizzle are properties of the currently
> bound textures, and require extra instructions to do the swizzling.
>
> We still listen to _NEW_TEXTURE and recompute the program keys. However,
> unless the app has actually /changed/ the swizzling, it should be a cache
> hit, and we won't have to recompile.
>
> - No changes are necessary in the old brw_vs_emit/brw_wm_emit backends:
> ARB_fragment_program uses texture unit numbers directly, so SamplerUnits
> is actually the identity mapping. Fixed function fragment shading uses
> the new brw_fs backend. Pre-GLSL, vertex processing couldn't use textures.
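
The cache-hit argument above can be sketched as follows, under the assumption that the program key holds only state that changes the generated code (here a per-sampler swizzle) and no longer holds texture unit numbers. The types and functions (struct prog_key, struct shader_cache, lookup_or_compile) are hypothetical and are not the driver's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SAMPLERS 16  /* assumed limit for this sketch */
#define CACHE_SIZE    8  /* assumed cache capacity for this sketch */

/* Hypothetical program key: only state that affects generated code. */
struct prog_key {
    unsigned program_id;
    unsigned char swizzle[MAX_SAMPLERS];  /* per-sampler swizzle setting */
};

/* Hypothetical cache of keys for already compiled shaders. */
struct shader_cache {
    struct prog_key keys[CACHE_SIZE];
    int count;     /* number of valid entries in keys[] */
    int compiles;  /* number of simulated compiles so far */
};

static bool key_equal(const struct prog_key *a, const struct prog_key *b)
{
    return a->program_id == b->program_id &&
           memcmp(a->swizzle, b->swizzle, sizeof(a->swizzle)) == 0;
}

/*
 * Return true on a cache hit. On a miss, record the key, count one
 * compile, and return false. Once full, the oldest slot is overwritten.
 */
static bool lookup_or_compile(struct shader_cache *c, const struct prog_key *key)
{
    for (int i = 0; i < c->count; i++)
        if (key_equal(&c->keys[i], key))
            return true;

    int slot = c->compiles % CACHE_SIZE;
    c->keys[slot] = *key;
    if (c->count < CACHE_SIZE)
        c->count++;
    c->compiles++;
    return false;
}
```

With a key like this, recomputing the key on every texture change costs only a lookup. A second compile happens only when a swizzle value actually differs.
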
>
> Those who talked to me earlier might be surprised to see that this doesn't
> have separate VS/FS sampler state tables. I realized that since the
> indices are actually part of the linked program, we basically already have
> a single combined table anyway; splitting them actually just wasted space
> (and required a lot of unnecessary code churn).
>
> No Piglit regressions on Sandybridge. Untested on Ivybridge.
> And no, I haven't run oglconform.
>
> Please review. It would be great to get this into 9.0.
Patches 1 through 6 are
Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
I probably won't have a chance to get through the rest, so don't wait on me.