[Mesa-dev] [PATCH 4/4] nir: add ARB_shader_ballot and ARB_shader_group_vote instructions

Mon Jun 5 21:43:06 UTC 2017

On Mon, Jun 5, 2017 at 1:50 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:

> On Mon, Jun 5, 2017 at 1:37 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> > I'm not sure how I feel about having these as ALU operations.  ALU
> > operations are generally pure functions (with the exception of
> > derivatives) that can be re-ordered at will.  I don't really like
> > breaking that.  In fact, I'd almost be inclined to make derivatives
> > intrinsics and just special-case them in constant folding.  Thoughts?
>
> I wasn't too sure about this either. It is a little weird to make
> these ALU instructions. I followed the rule here that if something can
> be constant-folded, it should be an ALU instruction, but I guess you
> can argue that it's just a coincidence that these can be
> constant-folded anyways.

Yeah.  As subgroup ops get more complicated, I think a lot of the subgroup
operations can be constant-folded after a fashion, but the rules get weird
fast.
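
To make "after a fashion" concrete, here is a rough Python sketch of the
folding rules implied by the const_expr strings in the quoted patch ("src0"
for the reads and the any/all votes, "true" for all_invocations_equal).  The
table and the fold_cross_thread_op helper are hypothetical illustrations,
not part of NIR.  The rules only hold because a compile-time constant has
the same value in every invocation.

```python
# Hypothetical model of the const_expr strings from the quoted patch.
# None of these names exist in NIR; this only mirrors the folding rules.
CONST_EXPRS = {
    # Reading a constant from any invocation yields that constant.
    "read_invocation": lambda src0, src1: src0,
    "read_first_invocation": lambda src0: src0,
    # A constant is uniform, so any/all over the subgroup equal it.
    "any_invocations": lambda src0: src0,
    "all_invocations": lambda src0: src0,
    # A uniform value is trivially equal across invocations.
    "all_invocations_equal": lambda src0: True,
}


def fold_cross_thread_op(name, *srcs):
    """Fold a cross-thread op whose sources are all constants."""
    return CONST_EXPRS[name](*srcs)


print(fold_cross_thread_op("any_invocations", False))
print(fold_cross_thread_op("all_invocations_equal", 42))
```

Something like ballot does not fit this table: as far as I can tell its
result depends on which invocations are active, which is not known at
compile time even when the source is a constant.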

> I guess the main downside is that it would be
> impossible to make nir_algebraic patterns with these, although I can't
> think of too many simple pattern-matching type things you'd want to do
> on these instructions anyways.

Yeah.  My gut also tells me that shaders which are "advanced" enough to use
subgroup features probably don't need the massive reductions we do for
D3D9-generated shaders (or those reductions can't be done on them).

> Maybe something like not(any(not(foo)))
> -> all(foo) and vice-versa?
>
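
The not(any(not(foo))) -> all(foo) rewrite mentioned above is De Morgan's
law applied across the subgroup.  A small Python sketch follows, using
nested tuples as a stand-in for expressions.  The tuple encoding and the
rewrite/evaluate helpers are assumptions made for illustration; this is not
nir_algebraic syntax, and "inot" is only borrowed as a name for logical not.

```python
from itertools import product

# Each vote op and its dual under De Morgan's law.
DUAL = {"any_invocations": "all_invocations",
        "all_invocations": "any_invocations"}


def rewrite(expr):
    """Rewrite not(any(not(x))) -> all(x) and not(all(not(x))) -> any(x)."""
    if not isinstance(expr, tuple):
        return expr  # leaf, e.g. the variable "foo"
    op, arg = expr[0], rewrite(expr[1])
    if (op == "inot" and isinstance(arg, tuple) and arg[0] in DUAL
            and isinstance(arg[1], tuple) and arg[1][0] == "inot"):
        return (DUAL[arg[0]], arg[1][1])
    return (op, arg)


def evaluate(expr, foo):
    """Evaluate expr for a subgroup; foo holds one bool per invocation."""
    if expr == "foo":
        return list(foo)
    op, arg = expr
    vals = evaluate(arg, foo)
    if op == "inot":
        return [not v for v in vals]
    if op == "any_invocations":
        return [any(vals)] * len(vals)
    if op == "all_invocations":
        return [all(vals)] * len(vals)
    raise ValueError(op)


def equivalent(before, width=4):
    """Brute-force check that the rewrite preserves every invocation's result."""
    after = rewrite(before)
    return all(evaluate(before, foo) == evaluate(after, foo)
               for foo in product([False, True], repeat=width))


pattern = ("inot", ("any_invocations", ("inot", "foo")))
print(rewrite(pattern), equivalent(pattern))
```

The brute-force check treats all invocations in a four-wide subgroup as
active, which is a simplification.
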
> >
> > On Mon, Jun 5, 2017 at 12:22 PM, Connor Abbott <cwabbott0 at gmail.com>
> wrote:
> >>
> >> Signed-off-by: Connor Abbott <cwabbott0 at gmail.com>
> >> ---
> >>  src/compiler/nir/nir_intrinsics.h | 14 ++++++++++++++
> >>  src/compiler/nir/nir_opcodes.py   | 18 ++++++++++++++++--
> >>  2 files changed, 30 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h
> >> index 21e7d90..157df7f 100644
> >> --- a/src/compiler/nir/nir_intrinsics.h
> >> +++ b/src/compiler/nir/nir_intrinsics.h
> >> @@ -330,6 +330,20 @@ SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)
> >>  SYSTEM_VALUE(alpha_ref_float, 1, 0, xx, xx, xx)
> >>  SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)
> >>  SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)
> >> +SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx)
> >> +
> >> +
> >> +/* ARB_shader_ballot instructions */
> >> +
> >> +SYSTEM_VALUE(subgroup_eq_mask, 1, 0, xx, xx, xx)
> >> +SYSTEM_VALUE(subgroup_ge_mask, 1, 0, xx, xx, xx)
> >> +SYSTEM_VALUE(subgroup_gt_mask, 1, 0, xx, xx, xx)
> >> +SYSTEM_VALUE(subgroup_le_mask, 1, 0, xx, xx, xx)
> >> +SYSTEM_VALUE(subgroup_lt_mask, 1, 0, xx, xx, xx)
> >> +
> >> +INTRINSIC(ballot, 1, ARR(0), true, 0, 0, 0, xx, xx, xx,
> >> +          NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER |
> >> +          NIR_INTRINSIC_CROSS_THREAD)
> >>
> >>  /* Blend constant color values.  Float values are clamped. */
> >>  SYSTEM_VALUE(blend_const_color_r_float, 1, 0, xx, xx, xx)
> >> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
> >> index be3ab6d..05a80b2 100644
> >> --- a/src/compiler/nir/nir_opcodes.py
> >> +++ b/src/compiler/nir/nir_opcodes.py
> >> @@ -120,8 +120,10 @@ def opcode(name, output_size, output_type, input_sizes, input_types,
> >>                            input_types, convergent, cross_thread,
> >>                            algebraic_properties, const_expr)
> >>
> >> -def unop_convert(name, out_type, in_type, const_expr):
> >> -   opcode(name, 0, out_type, [0], [in_type], "", const_expr)
> >> +def unop_convert(name, out_type, in_type, const_expr, cross_thread=False,
> >> +                 convergent=False):
> >> +   opcode(name, 0, out_type, [0], [in_type], "", const_expr, convergent,
> >> +          cross_thread)
> >>
> >>  def unop(name, ty, const_expr, convergent=False, cross_thread=False):
> >>     opcode(name, 0, ty, [0], [ty], "", const_expr, convergent, cross_thread)
> >> @@ -355,6 +357,18 @@ for i in xrange(1, 5):
> >>     for j in xrange(1, 5):
> >>        unop_horiz("fnoise{0}_{1}".format(i, j), i, tfloat, j, tfloat, "0.0f")
> >>
> >> +# ARB_shader_ballot instructions
> >> +
> >> +opcode("read_invocation", 0, tuint, [0, 1], [tuint, tuint32], "", "src0",
> >> +        cross_thread=True)
> >> +unop("read_first_invocation", tuint, "src0", cross_thread=True)
> >> +
> >> +# ARB_shader_group_vote instructions
> >> +
> >> +unop("any_invocations", tbool, "src0", cross_thread=True)
> >> +unop("all_invocations", tbool, "src0", cross_thread=True)
> >> +unop("all_invocations_equal", tbool, "true", cross_thread=True)
> >> +
> >>  def binop_convert(name, out_type, in_type, alg_props, const_expr):
> >>     opcode(name, 0, out_type, [0, 0], [in_type, in_type], alg_props, const_expr)
> >>
> >> --
> >> 2.9.3