[Mesa-dev] [PATCH 29/59] i965/fs: add a pass for lowering PACK opcodes
Kenneth Graunke
kenneth at whitecape.org
Mon May 2 21:40:15 UTC 2016
On Monday, May 2, 2016 9:15:53 AM PDT Iago Toral wrote:
> On Sat, 2016-04-30 at 00:50 -0700, Kenneth Graunke wrote:
> > On Friday, April 29, 2016 1:29:26 PM PDT Samuel Iglesias Gonsálvez wrote:
> > > From: Connor Abbott <cwabbott0 at gmail.com>
> > >
> > > ---
> > > src/mesa/drivers/dri/i965/Makefile.sources | 1 +
> > > src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +++
> > > src/mesa/drivers/dri/i965/brw_fs.h | 1 +
> > > src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp | 59 ++++++++++++++++++
++++
> > +++
> > > 4 files changed, 66 insertions(+)
> > > create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/
drivers/
> > dri/i965/Makefile.sources
> > > index 441d727..2b2a51d 100644
> > > --- a/src/mesa/drivers/dri/i965/Makefile.sources
> > > +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> > > @@ -26,6 +26,7 @@ i965_compiler_FILES = \
> > > brw_fs.h \
> > > brw_fs_live_variables.cpp \
> > > brw_fs_live_variables.h \
> > > + brw_fs_lower_pack.cpp \
> > > brw_fs_nir.cpp \
> > > brw_fs_reg_allocate.cpp \
> > > brw_fs_register_coalesce.cpp \
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/
dri/
> > i965/brw_fs.cpp
> > > index 3d6ee44..e9fd251 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > @@ -5341,6 +5341,11 @@ fs_visitor::optimize()
> > > OPT(dead_code_eliminate);
> > > }
> > >
> > > + if (OPT(lower_pack)) {
> > > + OPT(register_coalesce);
> > > + OPT(dead_code_eliminate);
> > > + }
> > > +
> > > OPT(opt_combine_constants);
> > > OPT(lower_integer_multiplication);
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/
i965/
> > brw_fs.h
> > > index a5c3297..08f27e4 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> > > @@ -174,6 +174,7 @@ public:
> > > void no16(const char *msg);
> > > void lower_uniform_pull_constant_loads();
> > > bool lower_load_payload();
> > > + bool lower_pack();
> > > bool lower_logical_sends();
> > > bool lower_integer_multiplication();
> > > bool lower_minmax();
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp b/src/mesa/
> > drivers/dri/i965/brw_fs_lower_pack.cpp
> > > new file mode 100644
> > > index 0000000..39ed401
> > > --- /dev/null
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
> > > @@ -0,0 +1,59 @@
> > > +/*
> > > + * Copyright © 2015 Connor Abbott
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person
obtaining a
> > > + * copy of this software and associated documentation files (the
> > "Software"),
> > > + * to deal in the Software without restriction, including without
> > limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute,
sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to whom
the
> > > + * Software is furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice (including the
> > next
> > > + * paragraph) shall be included in all copies or substantial portions
of
> > the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS
> > OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > DEALINGS
> > > + * IN THE SOFTWARE.
> > > + */
> > > +
> > > +#include "brw_fs.h"
> > > +#include "brw_cfg.h"
> > > +#include "brw_fs_builder.h"
> > > +
> > > +using namespace brw;
> > > +
> > > +bool
> > > +fs_visitor::lower_pack()
> > > +{
> > > + bool progress = false;
> > > +
> > > + foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
> > > + if (inst->opcode != FS_OPCODE_PACK)
> > > + continue;
> > > +
> > > + assert(inst->dst.file == VGRF);
> > > + assert(inst->saturate == false);
> > > + fs_reg dst = inst->dst;
> > > +
> > > + const fs_builder ibld(this, block, inst);
> > > +
> > > + for (unsigned i = 0; i < inst->sources; i++) {
> > > + ibld.MOV(stride(horiz_offset(retype(dst, inst->src[i].type),
i),
> > > + inst->sources),
> >
> > Will this work properly for UNIFORMs? I'm not sure horiz_offset does
> > what you want. It says:
> >
> > case UNIFORM:
> > case IMM:
> > /* These only have a single component that is implicitly splatted.
A
> > * horizontal offset should be a harmless no-op.
> > */
> > break;
> >
> > But it looks here like you're trying to use horiz_offset to access the
> > first and second 32-bit components of a 64-bit double float.
>
> Notice that the pack opcode uses horiz_offset to write to the result,
> which can't be a uniform since those are read-only.
Oh, right - so it's fine here. But it's used on a source in the
implementation of fsign, and in is_copy_payload(), and in the various
pack/unpack related patches (optimize unpack double, optimize pack
double, translate double pack/unpack). So we need a solution for those
cases, at least.
> The next patch handles unpack opcodes and those could in theory be
> affected. In practice though, I think this can't happen because the
> implementation of nir_intrinsic_load_uniform will always write to a vgrf
> and immediates should be handled by the either GLSL's or NIR's constant
> expressions passes.
That could be. Does something prevent UNIFORM file sources from being
copy propagated back into these opcodes? If so, we could just add an
assert.
>
> We could add an assert to check that the argument to unpack2x32 is
> always a VGRF.
>
> Alternatively we could implement horiz_offset for uniforms as we do for
> VGRF. As far as I see the places where we call this function at the
> moment won't call it for uniforms so it should not change existing
> behavior but I don't know if that is a better option. We could also try
> to avoid horiz_offset completely in the implementation of unpack and
> instead do the same thing manually.
Right, we might be able to just use byte_offset...not sure what the best
approach is.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160502/78204a80/attachment.sig>
More information about the mesa-dev
mailing list