[Mesa-dev] [PATCH 2/2] nv50/ir: add optimization for modulo by a non-power-of-2 value

Tobias Klausmann tobias.johannes.klausmann at mni.thm.de
Sun Nov 12 14:13:44 UTC 2017


On 11/11/17 4:12 AM, Ilia Mirkin wrote:
> We can still use the optimized division methods which make use of
> multiplication with overflow.
>
> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
> ---
>   src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index fabac662e7f..56a50320816 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -1188,6 +1188,21 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
>   
>               delete_Instruction(prog, i);
>            }
> +      } else if (s == 1) {
> +         // In this case, we still want the optimized lowering that we get
> +         // from having division by an immediate.
> +         //
> +         // a % b == a - (a/b) * b
> +         bld.setPosition(i, false);
> +         Value *div = bld.mkOp2v(OP_DIV, i->sType, bld.getSSA(),
> +                                 i->getSrc(0), i->getSrc(1));
> +         newi = bld.mkOp2(OP_ADD, i->sType, i->getDef(0), i->getSrc(0),
> +                          bld.mkOp2v(OP_MUL, i->sType, bld.getSSA(), div, i->getSrc(1)));
> +         // TODO: Check that target supports this. In this case, we know that
> +         // all backends do.
> +         newi->src(1).mod = Modifier(NV50_IR_MOD_NEG);
> +
> +         delete_Instruction(prog, i);
>         }
>         break;
>   


lgtm,

Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
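
For readers following along, here is a rough standalone sketch of the
identity the patch relies on, a % b == a - (a/b) * b, with the division
done as a multiply that keeps only the high bits. It is not the driver's
lowering: the helper names div_by_const and mod_by_const are made up for
illustration, it covers only unsigned 32-bit operands, and it assumes a
constant divisor d >= 2.

```cpp
#include <cstdint>

// Hypothetical helper, not from the patch: floor(a / d) for an unsigned
// 32-bit a and a fixed divisor d >= 2, using a widening multiply.
static inline uint32_t div_by_const(uint32_t a, uint32_t d)
{
    // magic = ceil(2^64 / d). A compiler would compute this once per
    // constant divisor rather than at run time.
    const uint64_t magic = UINT64_MAX / d + 1;
    const uint64_t hi = magic >> 32;
    const uint64_t lo = magic & 0xffffffffu;
    // (magic * a) >> 64, assembled from two 64-bit partial products so
    // that no 128-bit type is needed.
    return static_cast<uint32_t>((hi * a + ((lo * a) >> 32)) >> 32);
}

// a % d == a - (a / d) * d, so the remainder reuses the fast division.
static inline uint32_t mod_by_const(uint32_t a, uint32_t d)
{
    return a - div_by_const(a, d) * d;
}
```

For example, mod_by_const(100, 7) goes through div_by_const(100, 7) == 14
and then computes 100 - 14 * 7. The subtraction corresponds to the OP_ADD
with the negate modifier on its second source in the patch.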