[Mesa-dev] [PATCH 07/12] nir: Allow [iu]mul_high on non-32-bit types

Tue Oct 9 17:17:34 UTC 2018

On Tue, Oct 9, 2018 at 11:32 AM Ian Romanick <idr at freedesktop.org> wrote:

> On 10/08/2018 02:14 PM, Jason Ekstrand wrote:
> > On Mon, Oct 8, 2018 at 3:46 PM Ian Romanick <idr at freedesktop.org
> > <mailto:idr at freedesktop.org>> wrote:
> >
> >     On 10/05/2018 09:10 PM, Jason Ekstrand wrote:
> >     > ---
> >     >  src/compiler/nir/nir_constant_expressions.py |  1 +
> >     >  src/compiler/nir/nir_opcodes.py              | 43
> >     ++++++++++++++++++--
> >     >  2 files changed, 40 insertions(+), 4 deletions(-)
> >     >
> >     > diff --git a/src/compiler/nir/nir_constant_expressions.py
> >     b/src/compiler/nir/nir_constant_expressions.py
> >     > index 118af9f7818..afc0739e8b2 100644
> >     > --- a/src/compiler/nir/nir_constant_expressions.py
> >     > +++ b/src/compiler/nir/nir_constant_expressions.py
> >     > @@ -79,6 +79,7 @@ template = """\
> >     >  #include <math.h>
> >     >  #include "util/rounding.h" /* for _mesa_roundeven */
> >     >  #include "util/half_float.h"
> >     > +#include "util/bigmath.h"
> >     >  #include "nir_constant_expressions.h"
> >     >
> >     >  /**
> >     > diff --git a/src/compiler/nir/nir_opcodes.py
> >     b/src/compiler/nir/nir_opcodes.py
> >     > index 4ef4ecc6f22..209f0c5509b 100644
> >     > --- a/src/compiler/nir/nir_opcodes.py
> >     > +++ b/src/compiler/nir/nir_opcodes.py
> >     > @@ -443,12 +443,47 @@ binop("isub", tint, "", "src0 - src1")
> >     >  binop("fmul", tfloat, commutative + associative, "src0 * src1")
> >     >  # low 32-bits of signed/unsigned integer multiply
> >     >  binop("imul", tint, commutative + associative, "src0 * src1")
> >     > +
> >     >  # high 32-bits of signed integer multiply
> >     > -binop("imul_high", tint32, commutative,
> >     > -      "(int32_t)(((int64_t) src0 * (int64_t) src1) >> 32)")
> >     > +binop("imul_high", tint, commutative, """
> >
> >     This will enable imul_high for all integer types (ditto for umul_high
> >     below).  A later patch adds lowering for 64-bit integer type.  Will
> the
> >     backend do the right thing for [iu]mul_high of 16- or 8-bit types?
> >
> >
> > That's a good question.  Looks like lower_integer_multiplication in the
> > back-end will do nothing whatsoever, and we'll emit an illegal opcode
> > which will probably hang the GPU.  For 8 and 16, it's easy enough to
> > lower to a couple of conversions, a N*2-bit multiply, and a shift.  It's
> > also not obvious where the cut-off point for the optimization is.
> > Certainly, it's better in 64-bits than doing the division algorithm in
> > the shader and I think it's better for 32 but maybe not in 8 and 16?
> > I'm not sure.  I'm pretty sure my 32-bit benchmark gave positive results
> > (about 40-50% faster) but it was very noisy.
> >
> > I don't think anything allows 8 and 16-bit arithmetic right now.  Still,
> > should probably fix it...
>
> Hm... if an extension adds GL or Vulkan support for 16-bit arithmetic, I
> doubt it would add [iu]mul_high (e.g., GL_AMD_gpu_shader_int16).  I'd
> expect every GPU would support a 16x16=32 multiplier.  Would it be
> better to restrict this instruction to 32- and 64-bit and implement the
> integer division optimization differently for 8- and 16-bit sources?
>

We don't currently have a mechanism in NIR to restrict an instruction to a
particular set of bit-widths.  Maybe we should consider adding that?

As far as the optimization goes, I see a few different options;

 1) Add a few lines in opt_algebraic to do the lowering.  We could even
make it unconditional for now.
 2) Add a couple tiny helpers that switch on bit-width in opt_idiv_const
and do the lowering on-the-fly.
 3) Just disable the pass for 8 and 16-bit for now.  It's questionable
whether it's actually worth emitting 4-6 instructions for a simple division
at those bit sizes anyway.

I think my order of preference is 3 > 1 > 2.  Especially when you consider
that we have no good way to test the pass for anything other than 32 and
64-bit right now.

--Jason
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20181009/9b169b6e/attachment.html>