[Mesa-dev] [PATCH 00/12] Improve GLSL preprocessor performance
Vladislav Egorov
vegorov180 at gmail.com
Sat Jan 7 19:02:01 UTC 2017
There is a lot of room for improvement in the preprocessor. Here is a
quick benchmark of several popular C-like preprocessors on an artificial
4 MB "shader" (the Blender PBR shader concatenated 16 times). I also
wanted to add D's Warp, but didn't manage to compile it:
                    time      mem       page faults
clang 3.8           0.11s     32 MB     3K
gcc 5               0.063s    13 MB     1.3K
tcc                 0.067s    3 MB      0.4K
glslangValidator    0.63s     25 MB     3K
glcpp (Mesa)        0.36s     127 MB    31K
glcpp+jemalloc      0.39s     182 MB    1K
Not only is glcpp significantly slower than the other C-like
preprocessors (3x-6x slower), it also allocates much more memory.
This patch series improves the preprocessor in the following ways:
1. Print to an exponentially growing string instead of using printf()
and realloc() on each print.
2. Use Bloom filters to avoid excessive hash-table queries.
3. Create a hand-written, streamlined lexer/parser that bypasses
flex/bison tokenization/printing for simple cases. This one
adds a lot of code, but it also greatly improves
preprocessing speed.
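
To illustrate items 1 and 2, here is a minimal sketch, not the code from
the patches. The names strbuf, bloom and their helpers are hypothetical,
and FNV-1a plus a 64-bit filter are assumptions picked for brevity; the
series may use a different hash and filter size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer (not Mesa's actual string_buffer). */
struct strbuf {
   char *data;
   size_t len;   /* bytes used, excluding the terminating NUL */
   size_t cap;   /* bytes allocated */
};

/* Make room for `extra` more bytes plus a NUL. Capacity doubles, so the
 * number of realloc() calls is logarithmic in the final output size. */
static int strbuf_reserve(struct strbuf *b, size_t extra)
{
   size_t need = b->len + extra + 1;
   if (need <= b->cap)
      return 0;
   size_t cap = b->cap ? b->cap : 64;
   while (cap < need)
      cap *= 2;
   char *p = realloc(b->data, cap);
   if (!p)
      return -1;
   b->data = p;
   b->cap = cap;
   return 0;
}

/* Append n bytes of s and keep the buffer NUL-terminated. */
static int strbuf_append(struct strbuf *b, const char *s, size_t n)
{
   assert(b && s);
   if (strbuf_reserve(b, n) != 0)
      return -1;
   memcpy(b->data + b->len, s, n);
   b->len += n;
   b->data[b->len] = '\0';
   return 0;
}

/* Hypothetical one-word Bloom filter over defined macro names. */
struct bloom { uint64_t bits; };

/* FNV-1a, used here only because it is short and well known. */
static uint32_t fnv1a(const char *s, size_t n)
{
   uint32_t h = 2166136261u;
   for (size_t i = 0; i < n; i++) {
      h ^= (unsigned char)s[i];
      h *= 16777619u;
   }
   return h;
}

/* Derive two bit positions from a single hash value. */
static uint64_t bloom_mask(const char *s, size_t n)
{
   uint32_t h = fnv1a(s, n);
   return (UINT64_C(1) << (h & 63)) | (UINT64_C(1) << ((h >> 6) & 63));
}

static void bloom_add(struct bloom *f, const char *s, size_t n)
{
   f->bits |= bloom_mask(s, n);
}

/* Returns 0 when the name is definitely not in the set, so the
 * hash-table lookup can be skipped. Returns 1 when it may be present
 * (false positives are possible, false negatives are not). */
static int bloom_maybe_contains(const struct bloom *f, const char *s, size_t n)
{
   uint64_t m = bloom_mask(s, n);
   return (f->bits & m) == m;
}
```

The idea is that most identifiers in a shader are not macros, so a cheap
bit test in front of the hash table rejects the bulk of lookups, while
the doubling buffer replaces a realloc() per print with a handful of
reallocations over the whole run.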
A few benchmarks. The same 16x concatenated Blender PBR shader:
                      time      mem       page faults
glcpp                 0.36s     127 MB    31K
glcpp-new             0.026s    13 MB     2.7K
glcpp-new+jemalloc    0.026s    20 MB     1K
A nice improvement both in speed and amount of used memory.
A more realistic test: preprocessing my whole shader-db (more than
51K shaders from various Steam games) using a hybrid of shader-db's
run and glcpp that I hacked together:
          dumped from games    default shader-db's collection
Before    27.02s               0.52s
After     2.09s                0.14s
However, some games benefit much less from this series (Talos
Principle 0.45s -> 0.2s, Serious Sam 0.53s -> 0.22s, to name
a few). They are heavy users of the preprocessor, and they hit
the non-optimized path. It's possible to improve them too by
streamlining the path that skips #if 0 ... #endif blocks. It's also
possible to increase the speed of the fast path using SIMD
optimizations (Clang, for example, uses SSE to skip multiline comments).
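
As a rough sketch of the comment-skipping idea, and not something this
series implements, the scan for the end of a block comment can be handed
to memchr(), which many C libraries already implement with vector
instructions. The function name skip_block_comment is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* p points just past the opening slash-star of a block comment and end
 * points one past the last byte of the input. Returns a pointer just
 * past the closing star-slash, or end if the comment is unterminated. */
static const char *skip_block_comment(const char *p, const char *end)
{
   assert(p && end && p <= end);
   while (p < end) {
      /* Let the C library find the next '*' in bulk. */
      const char *star = memchr(p, '*', (size_t)(end - p));
      if (!star)
         return end;                 /* unterminated comment */
      if (star + 1 < end && star[1] == '/')
         return star + 2;            /* first byte after the comment */
      p = star + 1;                  /* lone '*', keep scanning */
   }
   return end;
}
```

A real preprocessor would also have to count the newlines inside the
skipped range so that line numbers in error messages stay correct.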
The series passes all Mesa's preprocessor tests. The output and error
output of the preprocessor after full shader-db's run is the same,
including line numbers in errors and so on. The only difference is that
it generates a bit less trailing whitespace, but trailing whitespace
doesn't really matter for a preprocessor. Other preprocessors drop
trailing whitespace entirely.
Vladislav Egorov (12):
glcpp: Print preprocessor output to string_buffer
glcpp: Avoid unnecessary strcmp()
glcpp: Use Bloom filter before identifier search
glcpp: Use string_buffer for continuations removal
ralloc: Avoid calling vsnprintf() twice
ralloc: Use strnlen() inside of strncat()
glcpp: Skip unnecessary line continuations removal
glcpp: Use strpbrk in the line continuations pass
glcpp: Avoid unnecessary linear_strdup
glcpp/tests: Allow different trailing whitespace
glcpp: Create fast path hand-written scanner
glcpp: Substitute trivial macros in the fast path
src/compiler/glsl/glcpp/glcpp-lex.l | 428 ++++++++++++++++++++++++++++++-
src/compiler/glsl/glcpp/glcpp-parse.y | 149 ++++++-----
src/compiler/glsl/glcpp/glcpp.h | 78 +++++-
src/compiler/glsl/glcpp/pp.c | 242 +++++++++++++----
src/compiler/glsl/glcpp/tests/glcpp-test | 4 +-
src/util/ralloc.c | 64 +++--
6 files changed, 820 insertions(+), 145 deletions(-)
--
2.7.4