[Mesa-dev] [PATCH 00/14] More substantial ff_fragment_shader cache key optimizations.

Thu Mar 30 18:09:19 UTC 2017

Hello,

This is the continuation of my ff_fragment_shader cache key optimizations.
I have kept trying to reduce the overhead of the make_state_key function, and
it seems that I have gained a little more. As this is the first time I have
ventured this deep into the Mesa codebase, it's possible that I did something
wrong along the way. Please point out anything that looks incorrect.
For example, I was a little confused by the indentation used in some parts of
the code (spaces only in some places, a mix of spaces and tabs in others). I
tried to preserve the existing indentation of the files I modified.

As before, the number of patches might be a little bit high since some of them
are very simple. I might squash some of them if you prefer that.

The first 3 patches are a rebased resend of a previous series. I have kept
Eric's r-by even though I changed their commit messages a little; I hope
that was OK.

Patches 4-9 contain simple self-contained improvements to the cache key and
its computation.

Patches 10 and 11 try to move some of the state computation to the point where
the state is changed. To do that, I have added a couple of compressed state
fields to the context object.
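
A minimal sketch of the idea behind patch 10, assuming a simplified fog state
struct. All names here (fog_state, packed_fog_mode, fog_set_mode, ...) are
hypothetical and chosen for illustration; they are not the fields or functions
the patches add. The packed value is refreshed whenever the mode or the enable
flag changes, so code building a cache key can copy it without any switch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* GL enum values, copied from the OpenGL headers to keep the sketch standalone. */
#define GL_EXP    0x0800
#define GL_EXP2   0x0801
#define GL_LINEAR 0x2601

/* Hypothetical packed encoding: four values, so two bits are enough. */
enum packed_fog_mode {
   PACKED_FOG_NONE = 0,
   PACKED_FOG_LINEAR,
   PACKED_FOG_EXP,
   PACKED_FOG_EXP2
};

/* Hypothetical, heavily simplified stand-in for the fog part of the context. */
struct fog_state {
   bool enabled;
   unsigned mode;                /* GL_LINEAR, GL_EXP or GL_EXP2 */
   uint8_t packed_mode;          /* packed form of mode */
   uint8_t packed_enabled_mode;  /* PACKED_FOG_NONE while fog is disabled */
};

/* Recompute the derived field; called from every setter below. */
static void fog_update_packed(struct fog_state *fog)
{
   assert(fog->packed_mode != PACKED_FOG_NONE);
   fog->packed_enabled_mode = fog->enabled ? fog->packed_mode : PACKED_FOG_NONE;
}

static void fog_init(struct fog_state *fog)
{
   fog->enabled = false;
   fog->mode = GL_EXP;                 /* the GL default fog mode */
   fog->packed_mode = PACKED_FOG_EXP;
   fog->packed_enabled_mode = PACKED_FOG_NONE;
}

/* Returns false for an invalid enum (real GL would raise an error instead). */
static bool fog_set_mode(struct fog_state *fog, unsigned mode)
{
   uint8_t packed;

   switch (mode) {
   case GL_LINEAR: packed = PACKED_FOG_LINEAR; break;
   case GL_EXP:    packed = PACKED_FOG_EXP;    break;
   case GL_EXP2:   packed = PACKED_FOG_EXP2;   break;
   default:        return false;
   }

   fog->mode = mode;
   fog->packed_mode = packed;
   fog_update_packed(fog);
   return true;
}

static void fog_set_enabled(struct fog_state *fog, bool enabled)
{
   fog->enabled = enabled;
   fog_update_packed(fog);
}
```

The trade-off is a few extra stores on state changes, which are rare, in
exchange for less work in the key computation, which runs far more often.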

Patches 12 and 13 use these new fields inside make_state_key, simplifying it
a lot. Along the way, I have fixed an apparent bug (GL_ONE was not handled as
a combine source), though it made no difference in a piglit quick run.
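
To illustrate the GL_ONE case, here is a rough sketch of translating a combine
source enum into a small packed value. The enum packed_src, MAX_UNITS and
pack_combine_source are hypothetical names made up for this example, and the
limit of 8 units is an assumption; the actual encoding in the patches may
differ:

```c
#include <assert.h>
#include <stdint.h>

/* GL enum values, copied from the OpenGL headers to keep the sketch standalone. */
#define GL_ZERO          0
#define GL_ONE           1
#define GL_TEXTURE       0x1702
#define GL_TEXTURE0      0x84C0
#define GL_CONSTANT      0x8576
#define GL_PRIMARY_COLOR 0x8577
#define GL_PREVIOUS      0x8578

#define MAX_UNITS 8   /* assumption made for this sketch */

/* Hypothetical packed encoding; the largest value is 13, so it fits in 4 bits. */
enum packed_src {
   SRC_TEXTURE0 = 0,          /* SRC_TEXTURE0 + i selects texture unit i */
   SRC_TEXTURE = MAX_UNITS,   /* texture of the current unit */
   SRC_PREVIOUS,
   SRC_PRIMARY_COLOR,
   SRC_CONSTANT,
   SRC_ZERO,
   SRC_ONE
};

/* Translate a GL combine source into its packed form. */
static uint8_t pack_combine_source(unsigned src)
{
   if (src >= GL_TEXTURE0 && src < GL_TEXTURE0 + MAX_UNITS)
      return (uint8_t)(SRC_TEXTURE0 + (src - GL_TEXTURE0));

   switch (src) {
   case GL_TEXTURE:       return SRC_TEXTURE;
   case GL_PREVIOUS:      return SRC_PREVIOUS;
   case GL_PRIMARY_COLOR: return SRC_PRIMARY_COLOR;
   case GL_CONSTANT:      return SRC_CONSTANT;
   case GL_ZERO:          return SRC_ZERO;
   case GL_ONE:           return SRC_ONE;   /* the case that is easy to miss */
   default:
      /* The sketch assumes the caller has already validated the enum. */
      assert(!"unexpected combine source");
      return SRC_PREVIOUS;
   }
}
```

Doing this translation once, when the TexEnv state is set, means the key
computation only has to copy the packed values.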

Finally, patch 14 uses the new compressed fog state for atifs state handling
in st/mesa, since that was simple to change. I didn't bother using the new
state in the classic DRI drivers.

I have run a piglit quick test on radeonsi before and after the series and
there were no differences apart from some unstable test results.

As for performance measurements, I have replayed a simple Minecraft apitrace
under perf-record 5 times and have found that:

1. The fps reported by the apitrace replay is too variable to show any
   difference. It can be considered a wash.
2. perf-report shows something more encouraging. The time spent in
   _mesa_get_fixed_func_fragment_program has dropped from ~0.78% to ~0.37%.
   The standard deviation here is ~0.025%, so the performance gain is
   statistically significant.
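
As a rough sanity check of the numbers in item 2, the drop can be expressed in
units of the quoted standard deviation. The helper gain_in_stddevs is a
hypothetical name, and the sketch assumes that the ~0.025% figure applies to
both the before and after measurements; it is not a full significance test:

```c
#include <assert.h>

/*
 * Express the difference between two measurements as a multiple of their
 * standard deviation. All three arguments must use the same unit
 * (here: percent of time spent in the function).
 */
static double gain_in_stddevs(double before, double after, double stddev)
{
   assert(stddev > 0.0);
   return (before - after) / stddev;
}
```

With the values above, gain_in_stddevs(0.78, 0.37, 0.025) comes out at roughly
16, far more than run-to-run noise of that size would explain.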

Regards,
Gustaw

Gustaw Smolarczyk (14):
  mesa/main/ff_frag: Use correct constant.
  mesa/main/ff_frag: Remove enabled_units.
  mesa/main/ff_frag: Reduce the size of nr_enabled_units.
  mesa/main/ff_frag: Remove unused struct.
  mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC.
  mesa/main/ff_frag: Simplify get_fp_input_mask.
  mesa/main/ff_frag: Store nr_enabled_units only once.
  mesa/main/ff_frag: Use gl_texture_object::TargetIndex.
  mesa/main/ff_frag: Don't retrieve format if not necessary.
  mesa/main: Maintain compressed fog mode.
  mesa/main: Maintain compressed TexEnv Combine state.
  mesa/main/ff_frag: Use compressed fog mode.
  mesa/main/ff_frag: Use compressed TexEnv Combine state.
  st/mesa: Use compressed fog mode for atifs.

 src/mesa/main/enable.c                    |   1 +
 src/mesa/main/ff_fragment_shader.cpp      | 506 ++++++++++--------------------
 src/mesa/main/fog.c                       |   9 +
 src/mesa/main/mtypes.h                    |  97 ++++++
 src/mesa/main/texstate.c                  | 103 ++++++
 src/mesa/state_tracker/st_atifs_to_tgsi.c |   6 +-
 src/mesa/state_tracker/st_atom_shader.c   |  17 +-
 7 files changed, 388 insertions(+), 351 deletions(-)

-- 
2.12.1