<div dir="ltr">I also got myself some benchmark data! I wrote a Python script that generates a shader_test file with 5000 back-to-back random integer division operations. The compiler takes a long time to compile the shader but, running it with shader_time, I see about a 2x improvement with my pass.<br></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Sep 13, 2018 at 2:41 PM Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Shader-db results on Sky Lake:<br>
<br>
total instructions in shared programs: 15105795 -> 15111403 (0.04%)<br>
instructions in affected programs: 72774 -> 78382 (7.71%)<br>
helped: 0<br>
HURT: 265<br>
<br>
Note that hurt here actually means helped because we're getting rid of<br>
integer quotient operations (which are a send on some platforms!) and<br>
replacing them with fairly cheap ALU ops.<br>
---<br>
src/intel/compiler/brw_nir.c | 4 +++-<br>
1 file changed, 3 insertions(+), 1 deletion(-)<br>
<br>
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c<br>
index b38c3ba383d..4de0a6c44d4 100644<br>
--- a/src/intel/compiler/brw_nir.c<br>
+++ b/src/intel/compiler/brw_nir.c<br>
@@ -569,6 +569,7 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,<br>
OPT(nir_opt_cse);<br>
OPT(nir_opt_peephole_select, 0);<br>
OPT(nir_opt_intrinsics);<br>
+ OPT(nir_opt_idiv_const, 0);<br>
OPT(nir_opt_algebraic);<br>
OPT(nir_opt_constant_folding);<br>
OPT(nir_opt_dead_cf);<br>
@@ -675,7 +676,8 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)<br>
*/<br>
nir_lower_int64(nir, nir_lower_imul64 |<br>
nir_lower_isign64 |<br>
- nir_lower_divmod64);<br>
+ nir_lower_divmod64 |<br>
+ nir_lower_imul_high64);<br>
<br>
nir = brw_nir_optimize(nir, compiler, is_scalar, true);<br>
<br>
-- <br>
2.17.1<br>
<br>
</blockquote></div>
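<div dir="ltr">For anyone wondering why trading an integer quotient for ALU ops is a win, here is a minimal Python sketch of the general idea: an unsigned division by a known constant becomes one wide multiply and one right shift. This illustrates the textbook round-up multiplier technique only; it is not the exact instruction sequence nir_opt_idiv_const emits, and the names magic_for_udiv and udiv_by_const are made up for this example. It also leans on Python's arbitrary-precision integers, so the multiplier may need one more bit than the operand, which a real 32-bit lowering has to handle with an extra fixup.<br></div>

```python
def magic_for_udiv(d, bits=32):
    """Return (multiplier, shift) such that n // d == (n * multiplier) >> shift
    for every unsigned n below 2**bits.

    Hypothetical helper for illustration; not part of Mesa.
    """
    if d < 1:
        raise ValueError("divisor must be a positive integer")
    # ceil(log2(d)): number of bits needed to represent d - 1.
    log2_ceil = (d - 1).bit_length()
    shift = bits + log2_ceil
    # ceil(2**shift / d), computed with integer arithmetic only.
    multiplier = ((1 << shift) + d - 1) // d
    return multiplier, shift


def udiv_by_const(n, d, bits=32):
    """Compute n // d using only a multiply and a right shift."""
    multiplier, shift = magic_for_udiv(d, bits)
    return (n * multiplier) >> shift


if __name__ == "__main__":
    import random

    # Spot-check against Python's own integer division.
    for _ in range(5000):
        n = random.randrange(1 << 32)
        d = random.randrange(1, 1 << 32)
        assert udiv_by_const(n, d) == n // d
    print("multiply-and-shift matches n // d on 5000 random cases")
```

<div dir="ltr">The multiplier and shift depend only on the divisor, so a compiler can work them out once at compile time and leave just the multiply and shift in the shader.<br></div>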