[Nouveau] [PATCH 3/3] nv50/ir: Fold IMM into MAD

Ilia Mirkin imirkin at alum.mit.edu
Sat Jan 10 16:39:56 PST 2015


On Sat, Jan 10, 2015 at 7:23 PM, Roy Spliet <rspliet at eclipso.eu> wrote:
> Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is
> a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be
> done post-RA because it is required that SDST == SSRC2.

"because it requires that"

>
> Signed-off-by: Roy Spliet <rspliet at eclipso.eu>
> ---
>  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 52 ++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 21d20ca..1fc3ae6 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -2259,6 +2259,56 @@ FlatteningPass::tryPredicateConditional(BasicBlock *bb)
>
>  // =============================================================================
>
> +// Fold Immediate into MAD; must be done after register allocation due to
> +// constraint SDST == SSRC2
> +// TODO:
> +// Does NVC0+ have other situations where this pass makes sense?
> +class NV50PostRaConstantFolding : public Pass
> +{
> +private:
> +   virtual bool visit(BasicBlock *);
> +};
> +
> +bool
> +NV50PostRaConstantFolding::visit(BasicBlock *bb)
> +{
> +   Value *vtmp;
> +   Instruction *def;
> +
> +   for (Instruction *i = bb->getFirst(); i; i = i->next) {
> +      switch (i->op) {
> +      case OP_MAD:
> +         if(i->def(0).getFile() == FILE_GPR &&
> +               i->src(0).getFile() == FILE_GPR &&
> +               i->src(1).getFile() == FILE_GPR &&
> +               i->src(2).getFile() == FILE_GPR &&
> +               i->getDef(0)->reg.data.id == i->getSrc(2)->reg.data.id) {


This would be much easier to read with the condition inverted into an
early exit:

if (... != GPR || ... != GPR || ...) break; (or continue...)
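To spell out the early-exit shape, here is a minimal standalone sketch. The types RegFile, Operand and Mad and the helper isFoldCandidate() are hypothetical stand-ins made up for illustration, not the real nv50_ir classes; only the structure of the checks mirrors the patch.

```cpp
// Hypothetical stand-ins for the IR types; not the real nv50_ir classes.
enum RegFile { FILE_GPR, FILE_IMMEDIATE, FILE_OTHER };

struct Operand {
    RegFile file; // register file the value lives in
    int id;       // allocated register number
};

struct Mad {
    Operand def;
    Operand src[3];
};

// Guard-clause form: reject early, so the interesting code that follows
// is not nested inside one large condition.
bool isFoldCandidate(const Mad &i)
{
    // Every operand must live in a GPR.
    if (i.def.file != FILE_GPR ||
        i.src[0].file != FILE_GPR ||
        i.src[1].file != FILE_GPR ||
        i.src[2].file != FILE_GPR)
        return false;

    // Post-RA constraint from the patch: SDST == SSRC2.
    if (i.def.id != i.src[2].id)
        return false;

    return true;
}
```

In the pass itself the two "return false" statements would become break (leaving the switch) or continue (moving to the next instruction).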

> +            for (int s = 1; s >= 0; s--) {

You don't end up using 's' in the loop. Did you mean to have some
clever logic that flips the order of src0 and src1 in case the "wrong"
one came from an immediate?
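For reference, a version that actually indexes by 's' might look roughly like the sketch below. Src, Mad and foldImmediate() are hypothetical simplifications, not nv50_ir code, and it assumes (as the patch's setSrc(1, ...) suggests) that the immediate has to end up in src1.

```cpp
#include <utility>

// Hypothetical, simplified operand: not the real nv50_ir Value/ValueRef.
struct Src {
    bool fromImmMov; // true if the value is defined by "MOV dst, IMM"
    int value;       // immediate value, or a register id otherwise
};

struct Mad {
    Src src[3];
    bool immInSrc1;  // set once an immediate has been folded into src1
};

// Check src1 first, then src0. The two multiplicands commute, so when
// the immediate is found in src0 the operands can simply be swapped.
bool foldImmediate(Mad &i)
{
    for (int s = 1; s >= 0; s--) {
        if (!i.src[s].fromImmMov)
            continue;
        if (s == 0)
            std::swap(i.src[0], i.src[1]);
        i.immInSrc1 = true;
        return true;
    }
    return false; // neither multiplicand came from an immediate MOV
}
```

This way the operands are only swapped when that is actually needed, and nothing has to be swapped back when no immediate is found.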

> +               def = i->getSrc(1)->getInsn();
> +               if (def->op == OP_MOV && def->src(0).getFile() == FILE_IMMEDIATE) {
> +                  vtmp = i->getSrc(1);
> +                  i->setSrc(1, def->getSrc(0));
> +                  if (vtmp->refCount() == 0)
> +                     delete_Instruction(bb->getProgram(), def);

This shouldn't be necessary; it's all allocated in an arena and will
get cleaned up later.
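As a generic illustration of the arena idea (this is not the codegen's actual allocator; Insn and Arena are hypothetical names), objects are handed out by a pool that owns them and are all released together when the pool goes away:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical instruction type, for illustration only.
struct Insn {
    int op;
};

// Minimal arena: the pool owns every object it hands out, and they are
// all destroyed together when the arena itself is destroyed.
class Arena {
public:
    Insn *create(int op)
    {
        pool.push_back(std::make_unique<Insn>(Insn{op}));
        return pool.back().get();
    }

    // Number of objects currently owned by the arena.
    std::size_t live() const { return pool.size(); }

private:
    std::vector<std::unique_ptr<Insn>> pool;
};
```

Callers never free individual objects; an instruction that is no longer referenced just stays in the pool until the whole arena is torn down.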

> +                  break;
> +               }
> +
> +               vtmp = i->getSrc(0);
> +               i->setSrc(0, i->getSrc(1));
> +               i->setSrc(1, vtmp);
> +            }
> +         }
> +         break;
> +      default:
> +         break;
> +      }
> +   }
> +
> +   return true;
> +}
> +
> +// =============================================================================
> +
>  // Common subexpression elimination. Stupid O^2 implementation.
>  class LocalCSE : public Pass
>  {
> @@ -2629,6 +2679,8 @@ bool
>  Program::optimizePostRA(int level)
>  {
>     RUN_PASS(2, FlatteningPass, run);
> +   if (getTarget()->getChipset() < 0xc0)
> +      RUN_PASS(2, NV50PostRaConstantFolding, run);
>     return true;
>  }
>
> --
> 2.1.0