[Mesa-dev] [PATCH] glsl: add lowering for double divide to rcp/mul

Fri Feb 20 06:21:28 PST 2015

On 19.02.2015 at 23:47, Dave Airlie wrote:
> From: Dave Airlie <airlied at redhat.com>
> 
> It looks like no hw does div anyways, so we should just
> lower at the GLSL level.
> 
> Signed-off-by: Dave Airlie <airlied at redhat.com>
> ---
>  src/glsl/lower_instructions.cpp | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
> index e8a69e7..ac6715b 100644
> --- a/src/glsl/lower_instructions.cpp
> +++ b/src/glsl/lower_instructions.cpp
> @@ -199,7 +199,7 @@ lower_instructions_visitor::sub_to_add_neg(ir_expression *ir)
>  void
>  lower_instructions_visitor::div_to_mul_rcp(ir_expression *ir)
>  {
> -   assert(ir->operands[1]->type->is_float());
> +   assert(ir->operands[1]->type->is_float() || ir->operands[1]->type->is_double());
>  
>     /* New expression for the 1.0 / op1 */
>     ir_rvalue *expr;
> @@ -327,7 +327,7 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
>     /* Don't generate new IR that would need to be lowered in an additional
>      * pass.
>      */
> -   if (lowering(DIV_TO_MUL_RCP) && ir->type->is_float())
> +   if (lowering(DIV_TO_MUL_RCP) && (ir->type->is_float() || ir->type->is_double()))
>        div_to_mul_rcp(div_expr);
>  
>     ir_expression *const floor_expr =
> @@ -1014,7 +1014,7 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
>     case ir_binop_div:
>        if (ir->operands[1]->type->is_integer() && lowering(INT_DIV_TO_MUL_RCP))
>  	 int_div_to_mul_rcp(ir);
> -      else if (ir->operands[1]->type->is_float() && lowering(DIV_TO_MUL_RCP))
> +      else if ((ir->operands[1]->type->is_float() || ir->operands[1]->type->is_double())&& lowering(DIV_TO_MUL_RCP))
>  	 div_to_mul_rcp(ir);
>        break;
>  
> 

FWIW I suspect that lowering ddiv won't always cut it, though that can
be dealt with later. (d3d11 in fact exposes ddiv as an "extended double
feature" rather than as part of the ordinary double feature, together
with dfma and drcp, which makes me wonder how you'd do a division with
the unextended version if you don't even have rcp... In any case, ddiv is
required to be precise to 0.5 ULP if supported, while drcp only needs
1.0 ULP, so rcp followed by mul can't guarantee what ddiv promises.)
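
To illustrate the precision concern, here is a minimal host-side sketch
(plain C++, not Mesa code) comparing a true double divide against the
lowered rcp/mul form. The helper names div_via_rcp and ulp_distance are
made up for this example, and it assumes an IEEE-754 host where 1.0 / b
is itself correctly rounded, which is a better case than a 1.0 ULP drcp.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: what a / b turns into after DIV_TO_MUL_RCP-style
// lowering, i.e. a * (1.0 / b). Two roundings instead of one.
static double div_via_rcp(double a, double b)
{
   const double rcp = 1.0 / b;   // first rounding
   return a * rcp;               // second rounding
}

// Hypothetical helper: distance in ULPs between two finite doubles of the
// same sign, computed from their bit patterns.
static std::int64_t ulp_distance(double x, double y)
{
   assert(std::isfinite(x) && std::isfinite(y));
   assert(std::signbit(x) == std::signbit(y));

   std::int64_t xi, yi;
   std::memcpy(&xi, &x, sizeof xi);
   std::memcpy(&yi, &y, sizeof yi);
   return xi > yi ? xi - yi : yi - xi;
}
```

With these, div_via_rcp(3.0, 10.0) and 3.0 / 10.0 differ by one ULP on
such a host, while div_via_rcp(4.0, 2.0) matches 4.0 / 2.0 exactly, so
the lowered form is only sometimes off, and a less precise hardware rcp
could make the gap larger.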

Roland