[Nouveau] [PATCH v2] nvc0/ir: propagate immediates to CALL input MOVs
Tobias Klausmann
tobias.johannes.klausmann at mni.thm.de
Wed Aug 16 20:16:14 UTC 2017
ping on this v2
On 8/13/17 3:02 AM, Tobias Klausmann wrote:
> When using builtin functions we have to move the inputs to registers $r0 and
> $r1. If one of the input values is an immediate, we fail to propagate it:
>
> ...
> mov u32 $r477 0x00000003 (0)
> ...
> mov u32 $r0 %r473 (0)
> mov u32 $r1 $r477 (0)
> call abs BUILTIN:0 (0)
> mov u32 %r495 $r1 (0)
> ...
>
> With this patch the immediate is propagated. That can leave the first MOV
> (the one defining $r477) superfluous, in which case we remove it:
>
> ...
>
> mov u32 $r0 %r473 (0)
> mov u32 $r1 0x00000003 (0)
> call abs BUILTIN:0 (0)
> mov u32 %r495 $r1 (0)
> ...
>
> Shaderdb stats:
> total instructions in shared programs : 4893460 -> 4893324 (-0.00%)
> total gprs used in shared programs : 582972 -> 582881 (-0.02%)
> total local used in shared programs : 17960 -> 17960 (0.00%)
>
>          local   gpr   inst   bytes
> helped       0    91    112     112
> hurt         0     0      0       0
>
> v2:
> Implement some changes proposed by imirkin. The manual deletion of the dead
> mov is necessary after ea22ac23e0 ("nvc0/ir: unlink values pre- and post-call
> to division function"): the potentially dead mov is now properly unlinked, so
> later passes do not notice the mov op at all and never clean it up. That
> accounts for a big chunk of the regression the above commit caused.
> Keep the deletion of the op where it is; deleting it later would
> unnecessarily blow up the size of the change.
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
> ---
> .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 21 +++++++++++++++++++--
> 1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> index c8f0701572..7243b1d2e4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> @@ -47,8 +47,25 @@ NVC0LegalizeSSA::handleDIV(Instruction *i)
>     int builtin;
>
>     bld.setPosition(i, false);
> -   bld.mkMovToReg(0, i->getSrc(0));
> -   bld.mkMovToReg(1, i->getSrc(1));
> +
> +   // Generate movs to the input regs for the call we want to generate
> +   for (int s = 0; i->srcExists(s); ++s) {
> +      Instruction *ld = i->getSrc(s)->getInsn();
> +      // check if we are moving an immediate, propagate it in that case
> +      if (!ld || ld->fixed || (ld->op != OP_LOAD && ld->op != OP_MOV) ||
> +            !(ld->src(0).getFile() == FILE_IMMEDIATE))
> +         bld.mkMovToReg(s, i->getSrc(s));
> +      else {
> +         assert(ld->getSrc(0) != NULL);
> +         bld.mkMovToReg(s, ld->getSrc(0));
> +         // Clear the src, to make code elimination possible here before we
> +         // delete the instruction i later
> +         i->setSrc(s, NULL);
> +         if (ld->isDead())
> +            delete_Instruction(prog, ld);
> +      }
> +   }
> +
>     switch (i->dType) {
>     case TYPE_U32: builtin = NVC0_BUILTIN_DIV_U32; break;
>     case TYPE_S32: builtin = NVC0_BUILTIN_DIV_S32; break;
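
For readers without the nv50_ir sources at hand, the idea of the loop above can
be sketched on a toy IR. This is only an illustration under stated assumptions:
Value, Insn, InputMov, Program, lowerCallInputs() and exampleDropsDeadMov() are
hypothetical names made up for this sketch, not Mesa types or functions, and
the use counter stands in for whatever liveness tracking the real IR does.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <vector>

// Toy stand-ins for the real IR; none of these types exist in Mesa.
struct Insn;

struct Value {
   bool isImm = false;    // true if this operand is an immediate
   uint32_t imm = 0;      // immediate payload, meaningful only if isImm
   Insn *def = nullptr;   // defining instruction, nullptr if none
   int uses = 0;          // how many instructions still read this value
};

struct Insn {
   bool isMov;            // true for a plain move
   Value *dst;
   Value *src;
};

// One "mov $r<reg> <src>" emitted in front of the call.
struct InputMov {
   int reg;
   const Value *src;
};

struct Program {
   std::list<Insn> insns; // std::list keeps element addresses stable
};

// Emit the input movs for a call taking `args`. If an argument is defined by
// a mov of an immediate, read the immediate directly and drop the defining
// mov once nothing else uses its result.
std::vector<InputMov> lowerCallInputs(Program &prog, std::vector<Value *> &args)
{
   std::vector<InputMov> movs;
   for (int s = 0; s < static_cast<int>(args.size()); ++s) {
      Value *arg = args[s];
      Insn *def = arg->def;
      if (!def || !def->isMov || !def->src->isImm) {
         movs.push_back({s, arg});   // nothing to propagate
         continue;
      }
      movs.push_back({s, def->src}); // propagate the immediate
      args[s] = nullptr;             // the call no longer reads the old value
      if (--arg->uses == 0) {
         // The defining mov is dead now: remove it by hand, since no later
         // pass in this toy model would find it.
         arg->def = nullptr;
         prog.insns.remove_if([def](const Insn &x) { return &x == def; });
      }
   }
   return movs;
}

// Mirrors the example from the commit message: the second argument comes
// from a mov of 0x3 whose only user is the call.
bool exampleDropsDeadMov()
{
   Program prog;
   Value imm;  imm.isImm = true; imm.imm = 0x3;
   Value r473; r473.uses = 1;
   Value r477; r477.uses = 1;
   prog.insns.push_back({true, &r477, &imm});
   r477.def = &prog.insns.back();

   std::vector<Value *> args = {&r473, &r477};
   std::vector<InputMov> movs = lowerCallInputs(prog, args);

   return movs.size() == 2 && movs[0].src == &r473 &&
          movs[1].src->isImm && movs[1].src->imm == 0x3 &&
          prog.insns.empty();
}
```

In this sketch the dead mov is deleted at the point where the use is dropped,
which matches the choice made in the patch to keep the deletion local rather
than rely on a later cleanup pass.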