[Mesa-dev] [PATCH 0/2] nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const
Eduardo Lima Mitev
elima at igalia.com
Thu Oct 22 07:12:57 PDT 2015
Last month I was working on an optimization for nir_opt_peephole_ffma and sent a request for comments to the list. Then I went on holiday and the work stalled.
Over the last few days I have resumed it and have been experimenting with some ideas Matt Turner suggested back then, in order to refine and improve the patch.
This is still a work in progress. However, since I'm about to take on a new task that should last a few weeks, I'm formally sending the patch for review now (patch 2/2), for two reasons: a) shader-db still shows good numbers despite the number of HURTs, and b) Jason was in favor of merging it back then.
Also, as suggested by Jason, I'm moving the whole pass to the i965 driver because my changes are i965-specific, and we are the only consumers anyway. This is what patch 1/2 does.
Shader-db results for i965 on Haswell:
total instructions in shared programs: 6240407 -> 6230467 (-0.16%)
instructions in affected programs: 1126478 -> 1116538 (-0.88%)
total loops in shared programs: 1979 -> 1979 (0.00%)
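As a quick sanity check on the percentages above, here is a minimal sketch that recomputes them from the raw instruction counts. The helper name percent_change is mine and is not part of shader-db; the counts are copied from the summary above.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Instruction counts copied from the Haswell shader-db summary.
total = percent_change(6240407, 6230467)
affected = percent_change(1126478, 1116538)
saved = 6240407 - 6230467

print(f"total instructions:    {total:.2f}%")     # -0.16%
print(f"affected programs:     {affected:.2f}%")  # -0.88%
print(f"instructions removed:  {saved}")          # 9940
```

Both rows drop by the same 9940 instructions, which is what one would expect if every change is confined to the affected programs.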
Eduardo Lima Mitev (2):
  nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver
  i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and
    fmul is a const
 src/glsl/Makefile.sources                        |   1 -
 src/glsl/nir/nir.h                               |   1 -
 src/glsl/nir/nir_opt_peephole_ffma.c             | 268 ------------------
 src/mesa/drivers/dri/i965/Makefile.sources       |   1 +
 src/mesa/drivers/dri/i965/brw_nir.c              |   2 +-
 src/mesa/drivers/dri/i965/brw_nir.h              |   2 +
 .../drivers/dri/i965/brw_nir_opt_peephole_ffma.c | 299 +++++++++++++++++++++
 7 files changed, 303 insertions(+), 271 deletions(-)
 delete mode 100644 src/glsl/nir/nir_opt_peephole_ffma.c
 create mode 100644 src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c