[Mesa-dev] [PATCH v2 1/7] nv50/ir: enable PostRaConstantFolding for [c0, f0)

Ilia Mirkin imirkin at alum.mit.edu
Wed Jan 27 09:37:46 PST 2016


Please make this work for all chips. I don't want to have these
partial optimizations in place. The reason it was OK for nv50 is that
I thought this only ever applied to nv50; I didn't realize that (a) we
didn't have FFMA32I hooked up and (b) that it had this
restriction. Also you'd want to drop the NV50 bit of the PostRA stuff.
And come to think of it, it's not ConstantFolding at all, it's
LoadPropagation.

Here's a commit that implements FFMA32I on SM35:
https://github.com/imirkin/mesa/commit/28a76c1d5fcdc40f441a5151c05438f0779e2dae

It should probably also check that src2 == dst. I didn't realize that
had to be the case at the time I wrote it (and didn't push it because
I hadn't actually tested it yet).
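
A minimal sketch of what that src2 == dst check could look like. The
types and the helper name (Operand, FmaInsn, fitsFfma32i) are made-up
stand-ins for illustration, not the real nv50_ir classes or the code in
the commit above, and the exact operand rules are an assumption based
only on the restriction described here:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real nv50_ir types; only what the check needs.
enum class File { Gpr, Immediate };

struct Operand {
   File file;
   uint32_t value; // register id for Gpr, raw 32-bit pattern for Immediate
};

struct FmaInsn {
   Operand dst;
   Operand src[3]; // dst = src[0] * src[1] + src[2]
};

// Returns true if the FMA could use a 32-bit-immediate form, ASSUMING that
// form has no separate field for the addend and therefore requires the
// addend (src2) to live in the same register as the destination.
inline bool fitsFfma32i(const FmaInsn &i)
{
   if (i.dst.file != File::Gpr)
      return false;
   if (i.src[0].file != File::Gpr)
      return false;
   if (i.src[1].file != File::Immediate)
      return false;
   if (i.src[2].file != File::Gpr)
      return false;
   // The restriction discussed above: src2 must be the same register as dst.
   return i.src[2].value == i.dst.value;
}
```

A pass would call something like this before replacing the register
source with the immediate, and leave the instruction alone otherwise.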

You'll need to do something similar for SM50 (GM107). Use nvdisasm,
but note it's super picky for SM50 - it wants the sched code and all 3
instructions that follow, it won't do them one at a time, unlike
previous gens. However I think envydis knows about those forms, might
be easier to just read it from there.

  -ilia


On Wed, Jan 27, 2016 at 12:25 PM, Karol Herbst <nouveau at karolherbst.de> wrote:
> From: Karol Herbst <git at karolherbst.de>
>
> helps shaders in multiple games
>
> total instructions in shared programs : 1925865 -> 1922112 (-0.19%)
> total gprs used in shared programs    : 251863 -> 251863 (0.00%)
> total local used in shared programs   : 5673 -> 5673 (0.00%)
> total bytes used in shared programs   : 17657840 -> 17624080 (-0.19%)
>
>                 local        gpr       inst      bytes
>     helped           0           0        2082        2082
>       hurt           0           0           0           0
>
> v2: only Tesla needs the lower path
>
> Signed-off-by: Karol Herbst <nouveau at karolherbst.de>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 95e9fdf..ced9904 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -2883,7 +2883,7 @@ NV50PostRaConstantFolding::visit(BasicBlock *bb)
>              def = def->getSrc(0)->getInsn();
>           if (def && def->op == OP_MOV && def->src(0).getFile() == FILE_IMMEDIATE) {
>              vtmp = i->getSrc(1);
> -            if (isFloatType(i->sType)) {
> +            if (typeSizeof(i->sType) >= 2) {
>                 i->setSrc(1, def->getSrc(0));
>              } else {
>                 ImmediateValue val;
> @@ -3325,7 +3325,7 @@ bool
>  Program::optimizePostRA(int level)
>  {
>     RUN_PASS(2, FlatteningPass, run);
> -   if (getTarget()->getChipset() < 0xc0)
> +   if (getTarget()->getChipset() < NVISA_GK20A_CHIPSET)
>        RUN_PASS(2, NV50PostRaConstantFolding, run);
>
>     return true;
> --
> 2.7.0

