[Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

Grazvydas Ignotas notasas at gmail.com
Sun Jun 26 13:15:57 UTC 2016


I tried this while playing with apitrace and am getting segfaults on
the cached (second) run of every trace I tried. I'm not sure whether
I've just picked "wrong" traces; you can take one example from this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=96425

It would also be a good idea to hide the cache debug messages behind
some env var, or at least send them to stderr and not stdout, as
stdout breaks programs that pipe data through stdout like qapitrace.
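
A minimal sketch of that suggestion, assuming a dedicated debug
variable. The name MESA_GLSL_CACHE_DEBUG and the helpers
cache_debug_enabled() and cache_debug() are hypothetical, invented for
illustration; they are not taken from the series or from Mesa:

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variable name, used only for this sketch. */
#define CACHE_DEBUG_ENV "MESA_GLSL_CACHE_DEBUG"

/* True when the variable is set to a non-empty value other than "0". */
static bool
cache_debug_enabled(void)
{
   const char *value = getenv(CACHE_DEBUG_ENV);

   if (value == NULL || value[0] == '\0')
      return false;

   return !(value[0] == '0' && value[1] == '\0');
}

/* printf-style helper that writes to stderr, never stdout.
 * Returns the number of characters written, or 0 when debugging is off.
 */
static int
cache_debug(const char *fmt, ...)
{
   va_list args;
   int written;

   if (!cache_debug_enabled())
      return 0;

   va_start(args, fmt);
   written = vfprintf(stderr, fmt, args);
   va_end(args);

   return written;
}
```

Existing printf() calls in the cache code could then become
cache_debug() calls with the same arguments, which keeps stdout clean
for tools such as qapitrace.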

Gražvydas

On Sun, Jun 26, 2016 at 7:16 AM, Timothy Arceri
<timothy.arceri at collabora.com> wrote:
> I've spent a bunch of time rebasing this series to remove the excess
> code churn and I've just pushed the results to the shader-cache branch
> mentioned below. There are no code changes to the end result but I've
> managed to get the patch count down to 80 (was 96, I think) and things
> should be much easier to review now.
>
> I've also had reports of people testing with additional games such as
> Dota 2 and seeing good results.
>
>
> On Tue, 2016-06-21 at 16:08 +1000, Timothy Arceri wrote:
>> Rather than send 90+ patches to the list, please see the repo at the
>> bottom of this email.
>>
>> The big update is that I've added all stages except compute and tested
>> with a few games, and everything seems to be working well so far.
>> Enabling the shader cache with the Shadow of Mordor benchmark makes
>> things noticeably smoother and helps consistently keep the min FPS at
>> 15 on my Skylake, whereas without it the min FPS can be anywhere
>> between 4 and 15.
>>
>> The Elemental demo, which Dave pointed out as also doing a bunch of
>> compiles during the demo, is also smoother, especially on the second
>> run, but it's really slow on my Skylake regardless. Maybe someone with
>> a high-end Skylake would like to give it a try.
>>
>>
>> V3:
>> - add support for geometry and tessellation stages
>> - cache clip planes
>> - reserve parameter storage before restoring list
>> - stop losing buffer blocks on cache fallback
>> - lots of little fixes I can't remember
>>
>> V2:
>> - rebased on master
>> - add support for encoding doubles
>> - renamed skip_cache params to is_cache_fallback, and fixed a related
>>   bug when disabling shader cache for xfb.
>>
>> This series is based on the great work done by Carl, Kristian and
>> others.
>>
>> I've split up Carl's original patches for easier review, and also
>> merged a number of fixes and clean-ups into his patches. However,
>> there is a little more code churn than is ideal, as the approach taken
>> by the original patches needed to be modified quite a lot. I'm hoping
>> it's not more than people can live with, as I'd like to keep some of
>> the history rather than just squashing everything.
>>
>> For now I have left in some printfs, as the feature is still disabled
>> by default and they are useful for debugging. I intend to fix this
>> soon by hiding them behind an environment variable.
>>
>> There are no regressions after two runs of piglit with shader cache
>> enabled on my Broadwell machine.
>>
>> This series enables the on-disk shader cache for all stages except
>> compute programs. For now, transform feedback and SSO programs skip
>> the cache; these will be added as follow-ups.
>>
>> My main goal with this series is to land something that passes
>> piglit. There are a number of optimisations that can still be done,
>> such as skipping more validation and state recreation when falling
>> back to a full recompile, but I would rather leave these until we have
>> something fully working.
>>
>> Here are the shader-db times (from V2):
>>
>> Cache disabled:
>>
>> Thread 1 took 1360.47 seconds and compiled 13015 shaders (not including SIMD16) with 50 GL context switches
>> Thread 3 took 1349.85 seconds and compiled 12848 shaders (not including SIMD16) with 40 GL context switches
>> Thread 2 took 1362.94 seconds and compiled 12637 shaders (not including SIMD16) with 36 GL context switches
>> Thread 0 took 1352.41 seconds and compiled 12593 shaders (not including SIMD16) with 46 GL context switches
>>
>> Cache enabled first run:
>>
>> Thread 1 took 1410.30 seconds and compiled 12678 shaders (not including SIMD16) with 34 GL context switches
>> Thread 2 took 1421.35 seconds and compiled 12822 shaders (not including SIMD16) with 50 GL context switches
>> Thread 0 took 1410.49 seconds and compiled 12999 shaders (not including SIMD16) with 40 GL context switches
>> Thread 3 took 1426.67 seconds and compiled 12594 shaders (not including SIMD16) with 48 GL context switches
>>
>> Cache enabled second run:
>>
>> Thread 0 took 259.84 seconds and compiled 12817 shaders (not including SIMD16) with 40 GL context switches
>> Thread 3 took 257.03 seconds and compiled 12533 shaders (not including SIMD16) with 50 GL context switches
>> Thread 1 took 256.18 seconds and compiled 12828 shaders (not including SIMD16) with 40 GL context switches
>> Thread 2 took 261.31 seconds and compiled 12915 shaders (not including SIMD16) with 39 GL context switches
>>
>> You can find the series in the shader-cache branch of:
>>
>> https://github.com/tarceri/Mesa_arrays_of_arrays.git
>>
>> MESA_GLSL_CACHE_ENABLE=1 enables the cache.
>>
>>
>>
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
