[Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.
Roland Scheidegger
sroland at vmware.com
Mon Mar 10 19:21:32 PDT 2014
On 11.03.2014 01:23, Ian Romanick wrote:
> I had a pretty similar patch on the top of my pow-optimization branch.
> I also expand x**3 and x**4. I had hoped that would enable some cases
> to expand then merge to MADs. It should also be faster on older GENs
> where POW perf sucks. I didn't send it out because I wanted to add a
> similar optimization in the back end that would turn x*x*x*x back into
> x**4 on GPUs where the POW would be faster.
I have no idea how POW performs on newer Intel GPU hardware. Unlike the
older pre-SNB hardware with its separate mathbox, the manual doesn't
list throughput for extended math functions (at least I never found it).
Still, I find it highly unlikely that a POW costs less than 2 muls
anywhere.
Roland
> I also didn't have anything in shader-db that benefitted from x**2 or
> x**3. It seems like there were a couple that would be modified by a
> x**5 flattening, but I think that would universally be slower....
>
> On 03/10/2014 03:54 PM, Matt Turner wrote:
>> Cuts two instructions out of SynMark's Gl32VSInstancing benchmark.
>> ---
>> src/glsl/opt_algebraic.cpp | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
>> index 5c49a78..8494bd9 100644
>> --- a/src/glsl/opt_algebraic.cpp
>> +++ b/src/glsl/opt_algebraic.cpp
>> @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
>> if (is_vec_two(op_const[0]))
>> return expr(ir_unop_exp2, ir->operands[1]);
>>
>> + if (is_vec_two(op_const[1])) {
>> + ir_variable *x = new(ir) ir_variable(ir->operands[1]->type, "x",
>> + ir_var_temporary);
>> + base_ir->insert_before(x);
>> + base_ir->insert_before(assign(x, ir->operands[0]));
>> + return mul(x, x);
>> + }
>> +
>> break;
>>
>> case ir_unop_rcp:
>>
>
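
For reference, the multiply counts behind the discussion above can be
sketched as follows. This is only an illustration, not Mesa code:
pow_by_muls is a hypothetical helper that evaluates x**n for a small
non-negative integer n by repeated squaring and reports how many
multiplies it used. It says nothing about what a POW or a MUL actually
costs on any given GPU.

```cpp
// Hypothetical helper (not part of Mesa): computes x**n for a small
// non-negative integer n using repeated squaring, and stores the number
// of multiplications performed in *mul_count.
static float pow_by_muls(float x, unsigned n, unsigned *mul_count)
{
   float result = 1.0f;
   bool have_result = false;   // avoids counting a trivial 1.0 * x
   float base = x;
   unsigned muls = 0;

   while (n != 0) {
      if (n & 1) {
         if (have_result) {
            result *= base;    // fold the current power into the result
            muls++;
         } else {
            result = base;     // first set bit: plain copy, no multiply
            have_result = true;
         }
      }
      n >>= 1;
      if (n != 0) {
         base *= base;         // square for the next bit
         muls++;
      }
   }

   *mul_count = muls;
   return result;
}
```

Under this scheme x**2 takes 1 multiply, x**3 and x**4 take 2 each, and
x**5 takes 3. Whether that beats a single POW depends on the hardware.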