[Mesa-dev] [PATCH 2/2] nv50/ir: add optimization for modulo by a non-power-of-2 value
Tobias Klausmann
tobias.johannes.klausmann at mni.thm.de
Sun Nov 12 14:13:44 UTC 2017
On 11/11/17 4:12 AM, Ilia Mirkin wrote:
> We can still use the optimized division methods which make use of
> multiplication with overflow.
>
> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index fabac662e7f..56a50320816 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -1188,6 +1188,21 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
>
>              delete_Instruction(prog, i);
>           }
> +      } else if (s == 1) {
> +         // In this case, we still want the optimized lowering that we get
> +         // from having division by an immediate.
> +         //
> +         // a % b == a - (a/b) * b
> +         bld.setPosition(i, false);
> +         Value *div = bld.mkOp2v(OP_DIV, i->sType, bld.getSSA(),
> +                                 i->getSrc(0), i->getSrc(1));
> +         newi = bld.mkOp2(OP_ADD, i->sType, i->getDef(0), i->getSrc(0),
> +                          bld.mkOp2v(OP_MUL, i->sType, bld.getSSA(), div, i->getSrc(1)));
> +         // TODO: Check that target supports this. In this case, we know that
> +         // all backends do.
> +         newi->src(1).mod = Modifier(NV50_IR_MOD_NEG);
> +
> +         delete_Instruction(prog, i);
>        }
>        break;
>
lgtm,
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
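
For anyone reading along, here is a rough standalone sketch of the arithmetic the patch relies on. It is not Mesa code: the helper names mod_via_div, div3_mulhi and mod3_mulhi are made up for illustration, and the constant 0xAAAAAAAB is the commonly used multiplier for dividing a 32-bit unsigned value by 3, not something taken from the patch. The first helper mirrors the DIV / MUL / ADD-with-negated-source sequence; the other two show what a multiply-high division by a constant might look like once the DIV itself is lowered.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: remainder computed the way the patch rewrites OP_MOD,
// i.e. a % b == a - (a / b) * b.
// Preconditions: b != 0, and not (a == INT32_MIN && b == -1).
int32_t mod_via_div(int32_t a, int32_t b)
{
    int32_t q = a / b;   // corresponds to OP_DIV; C++ truncates toward zero
    return a - q * b;    // corresponds to OP_ADD with the MUL result negated
}

// Hypothetical helper: unsigned division by the constant 3 using the high
// bits of a widening multiply. 0xAAAAAAAB == ceil(2^33 / 3).
uint32_t div3_mulhi(uint32_t a)
{
    return static_cast<uint32_t>((static_cast<uint64_t>(a) * 0xAAAAAAABull) >> 33);
}

// Hypothetical helper: a % 3 without a hardware divide, combining the two
// steps above.
uint32_t mod3_mulhi(uint32_t a)
{
    return a - div3_mulhi(a) * 3u;
}
```

The point is only that once the modulo is expressed through a division by an immediate, the existing division lowering applies, so the remainder costs a multiply-high, a multiply and a subtract rather than a full divide. The sign of the result follows the dividend, as with the built-in % operator.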