[Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.

Mon Mar 10 17:23:08 PDT 2014

I had a pretty similar patch on top of my pow-optimization branch.
I also expand x**3 and x**4.  I had hoped that would enable some cases
to expand and then merge into MADs.  It should also be faster on older
GENs where POW perf sucks.  I didn't send it out because I wanted to add a
similar optimization in the back end that would turn x*x*x*x back into
x**4 on GPUs where the POW would be faster.

I also didn't have anything in shader-db that benefited from x**2 or
x**3.  It seems like there were a couple that would be modified by an
x**5 flattening, but I think that would universally be slower.
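
As a rough illustration of the multiply counts behind the expansions
above, here is a standalone sketch.  It is not code from the branch or
from Mesa; expand_pow and expanded are hypothetical names, and it works
on plain floats rather than on GLSL IR.

```cpp
#include <cassert>

// Hypothetical illustration only: the result of expanding x**n into
// multiplies, plus the number of multiplies the expansion needed.
struct expanded {
   float value;
   int muls;
};

static expanded
expand_pow(float x, int n)
{
   // Only the small constant exponents discussed in this thread.
   assert(n >= 2 && n <= 5);

   float x2 = x * x;                      // shared by every case

   switch (n) {
   case 2:  return { x2, 1 };             // x*x
   case 3:  return { x2 * x, 2 };         // (x*x)*x
   case 4:  return { x2 * x2, 2 };        // (x*x)*(x*x), reusing x*x
   default: return { x2 * x2 * x, 3 };    // n == 5: one more multiply
   }
}
```

With x*x reused, x**4 takes the same two multiplies as x**3, while
x**5 needs a third.  Whether a given count beats a native POW depends
on the hardware, which the sketch does not model.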

On 03/10/2014 03:54 PM, Matt Turner wrote:
> Cuts two instructions out of SynMark's Gl32VSInstancing benchmark.
> ---
>  src/glsl/opt_algebraic.cpp | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
> index 5c49a78..8494bd9 100644
> --- a/src/glsl/opt_algebraic.cpp
> +++ b/src/glsl/opt_algebraic.cpp
> @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
>        if (is_vec_two(op_const[0]))
>           return expr(ir_unop_exp2, ir->operands[1]);
>  
> +      if (is_vec_two(op_const[1])) {
> +         ir_variable *x = new(ir) ir_variable(ir->operands[1]->type, "x",
> +                                              ir_var_temporary);
> +         base_ir->insert_before(x);
> +         base_ir->insert_before(assign(x, ir->operands[0]));
> +         return mul(x, x);
> +      }
> +
>        break;
>  
>     case ir_unop_rcp:
>