[Mesa-dev] [PATCH 29/59] i965/fs: add a pass for lowering PACK opcodes

Iago Toral itoral at igalia.com
Tue May 3 06:39:51 UTC 2016


On Mon, 2016-05-02 at 14:40 -0700, Kenneth Graunke wrote:
> On Monday, May 2, 2016 9:15:53 AM PDT Iago Toral wrote:
> > On Sat, 2016-04-30 at 00:50 -0700, Kenneth Graunke wrote:
> > > On Friday, April 29, 2016 1:29:26 PM PDT Samuel Iglesias Gonsálvez wrote:
> > > > From: Connor Abbott <cwabbott0 at gmail.com>
> > > > 
> > > > ---
> > > >  src/mesa/drivers/dri/i965/Makefile.sources      |  1 +
> > > >  src/mesa/drivers/dri/i965/brw_fs.cpp            |  5 +++
> > > >  src/mesa/drivers/dri/i965/brw_fs.h              |  1 +
> > > >  src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp | 59 ++++++++++++++++++
> ++++
> > > +++
> > > >  4 files changed, 66 insertions(+)
> > > >  create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/
> drivers/
> > > dri/i965/Makefile.sources
> > > > index 441d727..2b2a51d 100644
> > > > --- a/src/mesa/drivers/dri/i965/Makefile.sources
> > > > +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> > > > @@ -26,6 +26,7 @@ i965_compiler_FILES = \
> > > >  	brw_fs.h \
> > > >  	brw_fs_live_variables.cpp \
> > > >  	brw_fs_live_variables.h \
> > > > +	brw_fs_lower_pack.cpp \
> > > >  	brw_fs_nir.cpp \
> > > >  	brw_fs_reg_allocate.cpp \
> > > >  	brw_fs_register_coalesce.cpp \
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/
> dri/
> > > i965/brw_fs.cpp
> > > > index 3d6ee44..e9fd251 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > @@ -5341,6 +5341,11 @@ fs_visitor::optimize()
> > > >        OPT(dead_code_eliminate);
> > > >     }
> > > >  
> > > > +   if (OPT(lower_pack)) {
> > > > +      OPT(register_coalesce);
> > > > +      OPT(dead_code_eliminate);
> > > > +   }
> > > > +
> > > >     OPT(opt_combine_constants);
> > > >     OPT(lower_integer_multiplication);
> > > >  
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/
> i965/
> > > brw_fs.h
> > > > index a5c3297..08f27e4 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_fs.h
> > > > +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> > > > @@ -174,6 +174,7 @@ public:
> > > >     void no16(const char *msg);
> > > >     void lower_uniform_pull_constant_loads();
> > > >     bool lower_load_payload();
> > > > +   bool lower_pack();
> > > >     bool lower_logical_sends();
> > > >     bool lower_integer_multiplication();
> > > >     bool lower_minmax();
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp b/src/mesa/
> > > drivers/dri/i965/brw_fs_lower_pack.cpp
> > > > new file mode 100644
> > > > index 0000000..39ed401
> > > > --- /dev/null
> > > > +++ b/src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
> > > > @@ -0,0 +1,59 @@
> > > > +/*
> > > > + * Copyright © 2015 Connor Abbott
> > > > + *
> > > > + * Permission is hereby granted, free of charge, to any person 
> obtaining a
> > > > + * copy of this software and associated documentation files (the 
> > > "Software"),
> > > > + * to deal in the Software without restriction, including without 
> > > limitation
> > > > + * the rights to use, copy, modify, merge, publish, distribute, 
> sublicense,
> > > > + * and/or sell copies of the Software, and to permit persons to whom 
> the
> > > > + * Software is furnished to do so, subject to the following conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice (including the 
> > > next
> > > > + * paragraph) shall be included in all copies or substantial portions 
> of 
> > > the
> > > > + * Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> EXPRESS 
> > > OR
> > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
> SHALL
> > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > > OTHER
> > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> ARISING
> > > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > > DEALINGS
> > > > + * IN THE SOFTWARE.
> > > > + */
> > > > +
> > > > +#include "brw_fs.h"
> > > > +#include "brw_cfg.h"
> > > > +#include "brw_fs_builder.h"
> > > > +
> > > > +using namespace brw;
> > > > +
> > > > +bool
> > > > +fs_visitor::lower_pack()
> > > > +{
> > > > +   bool progress = false;
> > > > +
> > > > +   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
> > > > +      if (inst->opcode != FS_OPCODE_PACK)
> > > > +         continue;
> > > > +
> > > > +      assert(inst->dst.file == VGRF);
> > > > +      assert(inst->saturate == false);
> > > > +      fs_reg dst = inst->dst;
> > > > +
> > > > +      const fs_builder ibld(this, block, inst);
> > > > +
> > > > +      for (unsigned i = 0; i < inst->sources; i++) {
> > > > +         ibld.MOV(stride(horiz_offset(retype(dst, inst->src[i].type), 
> i),
> > > > +                         inst->sources),
> > > 
> > > Will this work properly for UNIFORMs?  I'm not sure horiz_offset does
> > > what you want.  It says:
> > > 
> > >    case UNIFORM:
> > >    case IMM:
> > >       /* These only have a single component that is implicitly splatted.  
> A
> > >        * horizontal offset should be a harmless no-op.
> > >        */
> > >       break;
> > > 
> > > But it looks here like you're trying to use horiz_offset to access the
> > > first and second 32-bit components of a 64-bit double float.
> > 
> > Notice that the pack opcode uses horiz_offset to write to the result,
> > which can't be a uniform since those are read-only.
> 
> Oh, right - so it's fine here.  But it's used on a source in the
> implementation of fsign, and in is_copy_payload(), and in the various
> pack/unpack related patches (optimize unpack double, optimize pack
> double, translate double pack/unpack).  So we need a solution for those
> cases, at least.
> 
> > The next patch handles unpack opcodes and those could in theory be
> > affected. In practice though, I think this can't happen because the
> > implementation of nir_intrinsic_load_uniform will always write to a vgrf
> > and immediates should be handled by the either GLSL's or NIR's constant
> > expressions passes.
> 
> That could be.  Does something prevent UNIFORM file sources from being
> copy propagated back into these opcodes?  If so, we could just add an
> assert.

I think this should be fine. All these happen at NIR->FS time and if
they always execute with VGRF registers as I say above, horiz offset
should work at this time. If copy-propagation does try to replace the
vgrf access with a constant at a later time, then horiz_offset has
already done its work (it has updated reg_offset and/or subreg_offset on
the vgrf) and we just need to make sure that copy-propagation does the
right thing when it tries to replace that vgrf access with a uniform
access.

We have patched copy-propagation to fix a number of cases that are
specific to fp64 and I think for this particular case, this commit here
is relevant:

https://github.com/Igalia/mesa/commit/6b85b6b7369d955a79f25b9af52d2eb90c96c84a

This patch is in the pack with copy-propagation fixes that we wanted to
to send after this series, maybe it would have been better to send them
together, I held those back to try and reduce the size of this initial
series, sorry if that made things more difficult to review. We can send
them for review now if you think they'll make things easier.

> > 
> > We could add an assert to check that the argument to unpack2x32 is
> > always a VGRF.
> > 
> > Alternatively we could implement horiz_offset for uniforms as we do for
> > VGRF. As far as I see the places where we call this function at the
> > moment won't call it for uniforms so it should not change existing
> > behavior but I don't know if that is a better option. We could also try
> > to avoid horiz_offset completely in the implementation of unpack and
> > instead do the same thing manually.
> 
> Right, we might be able to just use byte_offset...not sure what the best
> approach is.




More information about the mesa-dev mailing list