[Mesa-dev] [PATCH 00/15] GLSL memory allocation rework for faster compilation
Tapani Pälli
tapani.palli at intel.com
Mon Oct 10 05:58:14 UTC 2016
On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
> FYI, we use ralloc for a lot more than just the glsl compiler so the
> first few changes make me a bit nervous. There was someone working on
> making our driver more undefined-memory-friendly but I don't know what
> happened to those patches.
There's a bunch of patches like that in this series:
https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html
It looks like it just never landed, as it would have required more testing
on misc drivers?
>
> On Oct 8, 2016 3:58 AM, "Marek Olšák" <maraeo at gmail.com
> <mailto:maraeo at gmail.com>> wrote:
>
> Hi,
>
> This patch series reduces the number of malloc calls in the GLSL
> compiler by 63%. That leads to better compile times and less heap
> thrashing.
>
> It's done by switching memory allocations in the GLSL compiler to my
> new linear allocator that allocates out of a fixed-sized buffer with
> a monotonically increasing offset. If more buffers are needed, it
> chains them.
>
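To make the description above concrete, here is a minimal sketch of a linear (bump) allocator of the kind described: each allocation is carved out of a fixed-size buffer by advancing an offset, and a new buffer is chained in when the current one is full. All names (lin_ctx, lin_alloc, LIN_BUF_SIZE, ...) and the buffer size and alignment are assumptions made for illustration; this is not the code from the series' ralloc.c changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names and sizes; not the allocator from the patch series. */
#define LIN_BUF_SIZE 2048          /* assumed fixed buffer size in bytes */
#define LIN_ALIGN    8             /* assumed alignment of returned pointers */

struct lin_buf {
   struct lin_buf *next;           /* older, already used buffer in the chain */
   size_t offset;                  /* monotonically increasing bump offset */
   size_t size;                    /* usable bytes in data[] */
   _Alignas(LIN_ALIGN) unsigned char data[];
};

typedef struct lin_ctx {
   struct lin_buf *head;           /* newest buffer; allocations come from here */
   unsigned num_mallocs;           /* number of real malloc calls, for illustration */
} lin_ctx;

/* Chain a new buffer in front of the list. Oversized requests get their own
 * buffer that is just large enough. */
static struct lin_buf *lin_new_buf(lin_ctx *ctx, size_t min_size)
{
   size_t size = min_size > LIN_BUF_SIZE ? min_size : LIN_BUF_SIZE;
   struct lin_buf *buf = malloc(sizeof(*buf) + size);
   if (!buf)
      return NULL;
   buf->next = ctx->head;
   buf->offset = 0;
   buf->size = size;
   ctx->head = buf;
   ctx->num_mallocs++;
   return buf;
}

/* Allocate by bumping the offset; only call malloc when the buffer is full. */
static void *lin_alloc(lin_ctx *ctx, size_t size)
{
   assert(ctx != NULL);
   /* Round up so every returned pointer stays LIN_ALIGN-aligned. */
   size = (size + LIN_ALIGN - 1) & ~(size_t)(LIN_ALIGN - 1);

   struct lin_buf *buf = ctx->head;
   if (!buf || buf->offset + size > buf->size) {
      buf = lin_new_buf(ctx, size);
      if (!buf)
         return NULL;
   }
   void *ptr = buf->data + buf->offset;
   buf->offset += size;
   return ptr;
}

/* Individual allocations are never freed; the whole chain goes at once. */
static void lin_free_all(lin_ctx *ctx)
{
   struct lin_buf *buf = ctx->head;
   while (buf) {
      struct lin_buf *next = buf->next;
      free(buf);
      buf = next;
   }
   ctx->head = NULL;
}
```

In this sketch, many small short-lived allocations share one malloc call per buffer, which is the effect the cover letter attributes to the new allocator. The trade-off is that a single allocation cannot be released on its own.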
> The new allocator is used in all places where short-lived allocations
> are used with a high number of malloc calls. The series also contains
> other improvements not related to the new allocator that also improve
> compile times. The results are below.
>
> I tested my shader-db with shaders only being compiled to TGSI.
> (noop gallium driver)
>
>
> master + libc's malloc:
>
> real 0m54.182s
> user 3m33.640s
> sys 0m0.620s
> maxmem 275 MB
>
>
> master + jemalloc preloaded:
>
> real 0m45.044s
> user 2m56.356s
> sys 0m1.652s
> maxmem 284 MB
>
>
> the series + libc's malloc:
>
> real 0m46.221s
> user 3m2.080s
> sys 0m0.544s
> maxmem 270 MB
>
>
> the series + jemalloc preloaded:
>
> real 0m40.729s
> user 2m39.564s
> sys 0m1.232s
> maxmem 284 MB
>
>
> The series without jemalloc almost caught up with jemalloc + master.
> However, jemalloc also benefits.
>
> Current Mesa needs 54.182s and it drops to 40.729s with my series and
> jemalloc. The total change in compile time is -25% if we incorporate
> both. Without jemalloc, the difference is only -14.7%.
>
> With radeonsi, the improvement is slightly more than 1/2 of that
> (if you add the LLVM time). However, radeonsi also has asynchronous
> shader compilation hiding LLVM overhead in some cases, so it depends.
>
> Drivers with faster compiler backends will benefit more than radeonsi,
> but will probably not reach -25% or -14.7% (except softpipe, which uses
> TGSI as-is).
>
> The memory usage looks reasonable in all tested cases.
>
> Note: One of the first patches moves memset from ralloc to rzalloc.
> I tested and fixed the GLSL source -> TGSI path, but other codepaths
> may break, and you need to use valgrind to find all uninitialized
> variables that relied on ralloc doing memset (if there are any).
>
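The note above is about code that silently depended on the allocator zeroing memory. A minimal sketch of that hazard follows, using plain malloc/calloc as stand-ins; alloc_uninit, alloc_zeroed and struct node are hypothetical names and do not reproduce the ralloc/rzalloc API. The 0xCD poison fill is an extra debugging aid assumed here, not something the series is stated to do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example type; any struct with fields that callers expect
 * to start out as 0/NULL has the same problem. */
struct node {
   int kind;
   struct node *next;
   unsigned flags;
};

/* Stand-in for an allocator that no longer clears memory. The poison fill
 * makes accidental reliance on zeroed memory show up immediately. */
static void *alloc_uninit(size_t size)
{
   void *p = malloc(size);
   if (p)
      memset(p, 0xCD, size);
   return p;
}

/* Stand-in for an allocator that still guarantees zeroed memory. */
static void *alloc_zeroed(size_t size)
{
   return calloc(1, size);
}

/* With a non-zeroing allocator, the constructor has to set every field. */
static struct node *node_create(int kind)
{
   struct node *n = alloc_uninit(sizeof(*n));
   if (!n)
      return NULL;
   n->kind = kind;
   n->next = NULL;   /* previously implied by the allocator's memset */
   n->flags = 0;     /* previously implied by the allocator's memset */
   return n;
}
```

Call sites that cannot easily initialize every field can switch to the zeroing variant instead; valgrind, as mentioned above, remains the way to find the ones that were missed.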
> You can also find it here:
> https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework
>
> Please review.
>
> src/compiler/glsl/ast.h | 4 +-
> src/compiler/glsl/ast_to_hir.cpp | 4 +-
> src/compiler/glsl/ast_type.cpp | 13 ++-
> src/compiler/glsl/glcpp/glcpp-lex.l | 2 +-
> src/compiler/glsl/glcpp/glcpp-parse.y | 203 +++++++++++++++++---------------------
> src/compiler/glsl/glcpp/glcpp.h | 1 +
> src/compiler/glsl/glsl_lexer.ll | 16 +--
> src/compiler/glsl/glsl_parser.yy | 202 +++++++++++++++++++-------------------
> src/compiler/glsl/glsl_parser_extras.cpp | 6 +-
> src/compiler/glsl/glsl_parser_extras.h | 4 +-
> src/compiler/glsl/glsl_symbol_table.cpp | 19 ++--
> src/compiler/glsl/glsl_symbol_table.h | 1 +
> src/compiler/glsl/ir.cpp | 4 +
> src/compiler/glsl/ir.h | 13 ++-
> src/compiler/glsl/link_uniform_blocks.cpp | 2 +-
> src/compiler/glsl/list.h | 2 +-
> src/compiler/glsl/lower_packed_varyings.cpp | 8 +-
> src/compiler/glsl/opt_constant_propagation.cpp | 14 ++-
> src/compiler/glsl/opt_copy_propagation.cpp | 7 +-
> src/compiler/glsl/opt_copy_propagation_elements.cpp | 19 ++--
> src/compiler/glsl/opt_dead_code_local.cpp | 12 ++-
> src/compiler/glsl_types.cpp | 38 +------
> src/compiler/glsl_types.h | 6 +-
> src/compiler/nir/nir.c | 8 +-
> src/compiler/spirv/vtn_variables.c | 3 +-
> src/gallium/drivers/freedreno/ir3/ir3.c | 2 +-
> src/gallium/drivers/vc4/vc4_cl.c | 2 +-
> src/gallium/drivers/vc4/vc4_program.c | 2 +-
> src/gallium/drivers/vc4/vc4_simulator.c | 5 +-
> src/mesa/drivers/dri/i965/brw_state_batch.c | 5 +-
> src/util/ralloc.c | 392 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> src/util/ralloc.h | 93 ++++++++++++++++--
> 32 files changed, 782 insertions(+), 330 deletions(-)
>
> Marek
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org <mailto:mesa-dev at lists.freedesktop.org>
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>