[Mesa-dev] [PATCH 00/14] More substantial ff_fragment_shader cache key optimizations.
Gustaw Smolarczyk
wielkiegie at gmail.com
Thu Mar 30 18:09:19 UTC 2017
Hello,
This is a continuation of my ff_fragment_shader cache key optimizations.
I have continued trying to reduce the overhead of the make_state_key function,
and it seems that I have gained a little bit. As this is the first time I have
ventured so far into the mesa codebase, it's possible that I did something
wrong along the way. Please point it out if you find anything incorrect.
For example, I was a little bit confused by the indentation used in some parts
of the code (like using only spaces or a mix of spaces and tabs). I tried to preserve
the indentation of the files I modified.
As before, the number of patches might be a little bit high since some of them
are very simple. I might squash some of them if you prefer that.
The first 3 patches are a rebased resend of a previous series. I have kept
Eric's r-by (I have changed the commit messages of these a little bit; I hope
keeping the r-by was ok).
Patches 4-9 contain simple self-contained improvements to the cache key and
its computation.
Patches 10 and 11 try to move some of the state computation to the point where
the state is changed. I have added a couple of compressed state fields to the
context object.
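As a rough illustration of the idea behind patches 10 and 11 (this is not the
patch code; the struct, enum and function names below are made up for the
sketch), the compressed value can be recomputed in the state setters, so that
the cache key builder only has to copy a small integer:

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard OpenGL enum values for the fog modes. */
#define GL_EXP    0x0800
#define GL_EXP2   0x0801
#define GL_LINEAR 0x2601

/* Hypothetical 2-bit encoding; these names are invented for this sketch. */
enum packed_fog_mode {
   PACKED_FOG_NONE = 0,
   PACKED_FOG_LINEAR,
   PACKED_FOG_EXP,
   PACKED_FOG_EXP2
};

/* Hypothetical stand-in for the fog part of the context object. */
struct fog_state {
   bool enabled;
   unsigned mode;                /* GL_LINEAR, GL_EXP or GL_EXP2 */
   uint8_t packed_mode;          /* mode, compressed to 2 bits */
   uint8_t packed_enabled_mode;  /* packed_mode, or NONE if fog is disabled */
};

static uint8_t
pack_fog_mode(unsigned mode)
{
   switch (mode) {
   case GL_LINEAR: return PACKED_FOG_LINEAR;
   case GL_EXP:    return PACKED_FOG_EXP;
   case GL_EXP2:   return PACKED_FOG_EXP2;
   default:        return PACKED_FOG_NONE;
   }
}

/* Called when the application changes the fog mode. */
void
fog_set_mode(struct fog_state *fog, unsigned mode)
{
   fog->mode = mode;
   fog->packed_mode = pack_fog_mode(mode);
   fog->packed_enabled_mode = fog->enabled ? fog->packed_mode : PACKED_FOG_NONE;
}

/* Called when the application enables or disables fog. */
void
fog_set_enabled(struct fog_state *fog, bool enabled)
{
   fog->enabled = enabled;
   fog->packed_enabled_mode = enabled ? fog->packed_mode : PACKED_FOG_NONE;
}
```

The trade-off is a little extra work on every state change in exchange for
less work every time the key is built, which pays off when the key is built
much more often than the state changes.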
Patches 12 and 13 use these new fields inside make_state_key, simplifying it
a lot. Along the way, I have fixed an apparent bug (GL_ONE was not handled as
a combine source), though it made no difference in a piglit quick run.
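To make the GL_ONE remark concrete, here is a minimal sketch of translating a
combine source enum into a compact code. It is not the code from the patches:
the enum, the function name and MAX_UNITS are assumptions chosen for
illustration. The point is only that GL_ONE needs its own case next to
GL_ZERO, otherwise it falls through to the invalid/default path:

```c
/* Standard OpenGL enum values. */
#define GL_ZERO          0
#define GL_ONE           1
#define GL_TEXTURE       0x1702
#define GL_TEXTURE0      0x84C0
#define GL_CONSTANT      0x8576
#define GL_PRIMARY_COLOR 0x8577
#define GL_PREVIOUS      0x8578

/* Assumed number of texture units addressable as a source in this sketch. */
#define MAX_UNITS 8

/* Hypothetical compact source codes; all of them fit in 4 bits. */
enum packed_source {
   SRC_TEXTURE = 0,                          /* texture of the current unit */
   SRC_TEXTURE0,                             /* SRC_TEXTURE0 + n is unit n */
   SRC_CONSTANT = SRC_TEXTURE0 + MAX_UNITS,
   SRC_PRIMARY_COLOR,
   SRC_PREVIOUS,
   SRC_ZERO,
   SRC_ONE,
   SRC_INVALID
};

int
pack_combine_source(unsigned src)
{
   /* GL_TEXTUREn names an explicit unit (crossbar-style source). */
   if (src >= GL_TEXTURE0 && src < GL_TEXTURE0 + MAX_UNITS)
      return SRC_TEXTURE0 + (int)(src - GL_TEXTURE0);

   switch (src) {
   case GL_TEXTURE:       return SRC_TEXTURE;
   case GL_CONSTANT:      return SRC_CONSTANT;
   case GL_PRIMARY_COLOR: return SRC_PRIMARY_COLOR;
   case GL_PREVIOUS:      return SRC_PREVIOUS;
   case GL_ZERO:          return SRC_ZERO;
   case GL_ONE:           return SRC_ONE;   /* the case that is easy to miss */
   default:               return SRC_INVALID;
   }
}
```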
Finally, patch 14 uses the new compressed fog state for atifs state handling
in st/mesa, since it was quite simple to modify it. I didn't bother using
the new state for classic dri drivers.
I have run a piglit quick test on radeonsi before and after the series and
there were no differences apart from some unstable test results.
As for performance measurements, I have run a simple Minecraft apitrace
through perf-record 5 times and have found that:
1. The apitrace replay fps measure is too variable to show any difference.
It can be considered "a wash".
2. perf-report shows something more encouraging. The time spent in
_mesa_get_fixed_func_fragment_program has dropped from ~0.78% to ~0.37%.
The standard deviation here is ~0.025%, so the performance gain is
statistically significant.
Regards,
Gustaw
Gustaw Smolarczyk (14):
  mesa/main/ff_frag: Use correct constant.
  mesa/main/ff_frag: Remove enabled_units.
  mesa/main/ff_frag: Reduce the size of nr_enabled_units.
  mesa/main/ff_frag: Remove unused struct.
  mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC.
  mesa/main/ff_frag: Simplify get_fp_input_mask.
  mesa/main/ff_frag: Store nr_enabled_units only once.
  mesa/main/ff_frag: Use gl_texture_object::TargetIndex.
  mesa/main/ff_frag: Don't retrieve format if not necessary.
  mesa/main: Maintain compressed fog mode.
  mesa/main: Maintain compressed TexEnv Combine state.
  mesa/main/ff_frag: Use compressed fog mode.
  mesa/main/ff_frag: Use compressed TexEnv Combine state.
  st/mesa: Use compressed fog mode for atifs.
 src/mesa/main/enable.c                    |   1 +
 src/mesa/main/ff_fragment_shader.cpp      | 506 ++++++++++--------------------
 src/mesa/main/fog.c                       |   9 +
 src/mesa/main/mtypes.h                    |  97 ++++++
 src/mesa/main/texstate.c                  | 103 ++++++
 src/mesa/state_tracker/st_atifs_to_tgsi.c |   6 +-
 src/mesa/state_tracker/st_atom_shader.c   |  17 +-
 7 files changed, 388 insertions(+), 351 deletions(-)
--
2.12.1