[Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

Timothy Arceri timothy.arceri at collabora.com
Sun Jun 26 14:46:07 UTC 2016


On Sun, 2016-06-26 at 16:15 +0300, Grazvydas Ignotas wrote:
> Tried this while playing with apitrace and am getting segfaults when
> running any trace with a cached (second) run. Not sure if it's
> "wrong"
> traces I've chosen or what, you can take one example from this bug:
> https://bugs.freedesktop.org/show_bug.cgi?id=96425

Thanks for testing, I'll take a look tomorrow.

> 
> It would also be good idea to hide the cache debug messages behind
> some env var, or at least send them to stderr and not stdout, as
> stdout breaks programs that pipe data through stdout like qapitrace.

Right, that's my next task; I should get this done tomorrow as well. As
stated below :) "For now I have left in some printf's as the feature is
still disabled by default and they are useful for debugging. I intend
to fix this soon to hide them behind an environment var."
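
To make the plan concrete, here is a minimal sketch of debug output that
is off by default and goes to stderr rather than stdout, so tools that
pipe data through stdout are not disturbed. It is written in Python only
for brevity (the series itself is C), and the variable name
SHADER_CACHE_DEBUG and both helper names are hypothetical, not something
the branch defines.

```python
import os
import sys

# Hypothetical variable name, used for illustration only.
DEBUG_ENV_VAR = "SHADER_CACHE_DEBUG"


def cache_debug_enabled(environ=os.environ):
    """Return True when the debug variable is set to something other than 0."""
    return environ.get(DEBUG_ENV_VAR, "0") not in ("", "0")


def cache_debug(message, environ=os.environ):
    """Print a cache debug message to stderr, but only when enabled.

    Returns True if the message was written, False if it was suppressed.
    """
    if not cache_debug_enabled(environ):
        return False
    # stderr keeps stdout clean for programs that pipe data through it.
    print(f"shader-cache: {message}", file=sys.stderr)
    return True
```

With the variable unset, cache_debug("cache hit") writes nothing and
returns False, so stdout and stderr both stay quiet by default.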

Thanks again.

> 
> Gražvydas
> 
> On Sun, Jun 26, 2016 at 7:16 AM, Timothy Arceri
> <timothy.arceri at collabora.com> wrote:
> > I've spent a bunch of time rebasing this series to remove the
> > excess
> > code churn and I've just pushed the results to the shader-cache
> > branch
> > mentioned below. There are no code changes to the end result but
> > I've
> > > managed to get the patch count down to 80 (was 96 I think) and
> > things
> > should be much easier to review now.
> > 
> > I've also had reports of people testing with additional games such
> > as
> > Dota 2 and seeing good results.
> > 
> > 
> > On Tue, 2016-06-21 at 16:08 +1000, Timothy Arceri wrote:
> > > > Rather than send 90+ patches to the list, please see the repo at
> > > the
> > > bottom of this email.
> > > 
> > > The big update is I've added all stages but compute and tested
> > > with a
> > > few games and everything seems to be working well so far.
> > > Enabling
> > > > shader cache with the Shadow of Mordor benchmark makes things
> > > > noticeably
> > > > smoother and helps consistently keep the min FPS at 15 on my
> > > > Skylake,
> > > > whereas without it the min FPS can be anywhere between 4 and 15.
> > > 
> > > The elemental demo which Dave pointed out as also doing a bunch
> > > of
> > > compiles during the demo is also smoother especially on the
> > > second
> > > run
> > > > but it's really slow on my Skylake regardless. Maybe someone with
> > > > a
> > > > high-end Skylake would like to give it a try.
> > > 
> > > 
> > > V3:
> > > - add support for geometry and tessellation stages
> > > - cache clip planes
> > > - reserve parameter storage before restoring list
> > > > - stop losing buffer blocks on cache fallback
> > > > - lots of little fixes I can't remember
> > > 
> > > V2:
> > > - rebased on master
> > > - add support for encoding doubles
> > > - renamed skip_cache params to is_cache_fallback, and fix related
> > > bug
> > > when
> > >  disabling shader cache for xfb.
> > > 
> > > This series is based on the great work done by Carl, Kristian and
> > > others.
> > > 
> > > > I've split up Carl's original patches for easier review, and also
> > > > merged
> > > > a number of fixes and clean-ups into his patches. However there
> > > > is a
> > > > little more code churn than is ideal as the approach taken by the
> > > > original patches needed to be modified quite a lot, I'm hoping
> > > > it's
> > > not
> > > more than people can live with as I'd like to keep some of the
> > > history
> > > rather than just squashing everything.
> > > 
> > > For now I have left in some printf's as the feature is still
> > > disabled
> > > by default and they are useful for debugging. I intend to fix
> > > this
> > > soon
> > > to hide them behind an environment var.
> > > 
> > > There are no regressions after two runs of piglit with shader
> > > cache
> > > enabled on my Broadwell machine.
> > > 
> > > > This series enables on disk shader cache for all stages except
> > > compute
> > > programs. For now transform feedback, and SSO programs skip using
> > > the
> > > cache, these will be added as follow ups.
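
For readers new to the idea, the lookup-or-fall-back flow described
above can be sketched roughly as follows. This is an illustration under
stated assumptions, not the code in the branch: it is Python rather than
C, the SHA-1 key over the source text is an assumed keying scheme, and
compile_shader(), get_program() and the uses_xfb / is_sso flags are
hypothetical names.

```python
import hashlib
import os
import tempfile


def cache_key(source):
    """Derive a file name from the shader source (assumed keying scheme)."""
    return hashlib.sha1(source.encode("utf-8")).hexdigest()


def compile_shader(source):
    """Hypothetical stand-in for the expensive full compile."""
    return source.upper().encode("utf-8")


def get_program(source, cache_dir, uses_xfb=False, is_sso=False):
    """Return (binary, from_cache) for a shader program."""
    # Transform feedback and SSO programs skip the cache for now.
    if uses_xfb or is_sso:
        return compile_shader(source), False

    path = os.path.join(cache_dir, cache_key(source))
    try:
        with open(path, "rb") as f:
            return f.read(), True  # cache hit: no compile needed
    except FileNotFoundError:
        pass

    # Cache miss: do the full compile, then store the result.
    binary = compile_shader(source)
    # Write to a temporary file and rename it into place, so a
    # concurrent reader never sees a half-written entry.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(binary)
    os.replace(tmp_path, path)
    return binary, False
```

Calling get_program() twice with the same source and directory compiles
on the first call and reads the stored binary on the second, which is
the behaviour the first-run and second-run timings further down reflect.
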
> > > 
> > > > My main goal with this series is to land something that
> > > > passes piglit. There are a number of optimisations that can still
> > > be
> > > done
> > > such as skipping more validation and state recreation when
> > > falling
> > > back
> > > to a full recompile but I would rather leave this until we have
> > > something fully working.
> > > 
> > > Here are the shader-db times (from V2):
> > > 
> > > Cache disabled:
> > > 
> > > > Thread 1 took 1360.47 seconds and compiled 13015 shaders (not including SIMD16) with 50 GL context switches
> > > > Thread 3 took 1349.85 seconds and compiled 12848 shaders (not including SIMD16) with 40 GL context switches
> > > > Thread 2 took 1362.94 seconds and compiled 12637 shaders (not including SIMD16) with 36 GL context switches
> > > > Thread 0 took 1352.41 seconds and compiled 12593 shaders (not including SIMD16) with 46 GL context switches
> > > 
> > > Cache enabled first run:
> > > 
> > > > Thread 1 took 1410.30 seconds and compiled 12678 shaders (not including SIMD16) with 34 GL context switches
> > > > Thread 2 took 1421.35 seconds and compiled 12822 shaders (not including SIMD16) with 50 GL context switches
> > > > Thread 0 took 1410.49 seconds and compiled 12999 shaders (not including SIMD16) with 40 GL context switches
> > > > Thread 3 took 1426.67 seconds and compiled 12594 shaders (not including SIMD16) with 48 GL context switches
> > > 
> > > Cache enabled second run:
> > > 
> > > > Thread 0 took 259.84 seconds and compiled 12817 shaders (not including SIMD16) with 40 GL context switches
> > > > Thread 3 took 257.03 seconds and compiled 12533 shaders (not including SIMD16) with 50 GL context switches
> > > > Thread 1 took 256.18 seconds and compiled 12828 shaders (not including SIMD16) with 40 GL context switches
> > > > Thread 2 took 261.31 seconds and compiled 12915 shaders (not including SIMD16) with 39 GL context switches
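
To put the three sets of timings side by side, the short script below
averages the per-thread times quoted above and derives the first-run
overhead and the second-run speedup. The numbers are copied from the
runs; the summarize() helper is just an illustrative name. Per-thread
shader counts differ slightly between runs, so treat the result as a
rough comparison.

```python
from statistics import mean

# Per-thread wall-clock seconds copied from the shader-db runs above.
cache_disabled = [1360.47, 1349.85, 1362.94, 1352.41]
cache_first_run = [1410.30, 1421.35, 1410.49, 1426.67]
cache_second_run = [259.84, 257.03, 256.18, 261.31]


def summarize(baseline, first, second):
    """Compare mean per-thread times across the three configurations."""
    base, cold, warm = mean(baseline), mean(first), mean(second)
    return {
        # Extra time spent populating the cache on the first run.
        "first_run_overhead_pct": (cold - base) / base * 100.0,
        # How many times faster the second run is than no cache at all.
        "second_run_speedup": base / warm,
    }


result = summarize(cache_disabled, cache_first_run, cache_second_run)
print(f"first run overhead: {result['first_run_overhead_pct']:.1f}%")
print(f"second run speedup: {result['second_run_speedup']:.1f}x")
```

That works out to roughly 4.5% extra time on the first (cache filling)
run and a bit over 5x faster on the second run.
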
> > > 
> > > You can find the series in the shader-cache branch of:
> > > 
> > > https://github.com/tarceri/Mesa_arrays_of_arrays.git
> > > 
> > > MESA_GLSL_CACHE_ENABLE=1 enables the cache.
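
For anyone scripting a before/after comparison, one way to launch a
program with the variable set or unset is sketched below. Only the
MESA_GLSL_CACHE_ENABLE name comes from this series; run_with_cache() is
a hypothetical helper, and the command you pass is up to you.

```python
import os
import subprocess
import time


def run_with_cache(cmd, enable=True):
    """Run cmd with the shader cache variable set or removed.

    Returns (exit_code, elapsed_seconds).
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    if enable:
        env["MESA_GLSL_CACHE_ENABLE"] = "1"
    else:
        env.pop("MESA_GLSL_CACHE_ENABLE", None)
    start = time.perf_counter()
    proc = subprocess.run(cmd, env=env, check=False)
    return proc.returncode, time.perf_counter() - start
```

Running the same command twice with enable=True gives the first-run and
second-run times; a run with enable=False gives the baseline.
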
> > > 
> > > 
> > > 
> > > 
> > > _______________________________________________
> > > mesa-dev mailing list
> > > mesa-dev at lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev

