<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jun 5, 2017 at 6:37 PM, Connor Abbott <span dir="ltr"><<a href="mailto:cwabbott0@gmail.com" target="_blank">cwabbott0@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I pushed a v2 at<br>
<a href="https://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-divergence-v2" rel="noreferrer" target="_blank">https://cgit.freedesktop.org/~<wbr>cwabbott0/mesa/log/?h=nir-<wbr>divergence-v2</a>.<br>
I'm not sure if I like this version better, though. I'll have to think<br>
about it. In the meantime, feel free to take a look.<br><div class="HOEnZb"><div class="h5"></div></div></blockquote><div><br></div><div>I've skimmed through the branch and, like you, I'm not sure either. Here are a few thoughts in no particular order:<br><br></div><div> 1) Other than the fact that it's a pile of churn, it doesn't seem to make much difference whether dFdx and dFdy are ALU instructions or intrinsics.<br><br></div><div> 2) Convergent instructions are, in a lot of ways, easier to deal with than plain cross-thread ones. Convergent ops can always be moved up the dominance tree or down into uniform control-flow. Regular cross-thread instructions can't be moved across any non-uniform control-flow.<br><br></div><div> 3) dFdx and dFdy are weird because they're convergent, so it's clear they are special, but it's not clear they should be intrinsics instead of ALU instructions.<br><br></div><div> 4) I like the nir_instr_is_convergent() and nir_instr_is_cross_thread() helpers.<br><br></div><div> 5) Non-convergent cross-thread instructions should definitely be intrinsics.<br><br></div><div> 6) I think the shader ballot stuff is all non-convergent cross-thread, as are some of the more advanced subgroup operations (see HLSL shader model 6.0).<br><br></div><div>That's all for now,<br><br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">
On Mon, Jun 5, 2017 at 2:43 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
> On Mon, Jun 5, 2017 at 1:50 PM, Connor Abbott <<a href="mailto:cwabbott0@gmail.com">cwabbott0@gmail.com</a>> wrote:<br>
>><br>
>> On Mon, Jun 5, 2017 at 1:37 PM, Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br>
>> wrote:<br>
>> > I'm not sure how I feel about having these as ALU operations. ALU<br>
>> > operations are generally pure functions (with the exception of derivatives)<br>
>> > that<br>
>> > can be re-ordered at will. I don't really like breaking that. In fact,<br>
>> > I'd<br>
>> > almost be inclined to make derivatives intrinsics and just special-case<br>
>> > them<br>
>> > in constant folding. Thoughts?<br>
>><br>
>> I wasn't too sure about this either. It is a little weird to make<br>
>> these ALU instructions. I followed the rule here that if something can<br>
>> be constant-folded, it should be an ALU instruction, but I guess you<br>
>> can argue that it's just a coincidence that these can be<br>
>> constant-folded anyways.<br>
><br>
><br>
> Yeah. As subgroup ops get more complicated, I think a lot of the subgroup<br>
> operations can be constant-folded after a fashion but the rules get weird<br>
> fast.<br>
><br>
>><br>
>> I guess the main downside is that it would be<br>
>> impossible to make nir_algebraic patterns with these, although I can't<br>
>> think of too many simple pattern-matching type things you'd want to do<br>
>> on these instructions anyways.<br>
><br>
><br>
> Yeah. My gut also tells me that shaders which are "advanced" enough to use<br>
> subgroup features probably don't need the massive reductions we do for<br>
> D3D9-generated shaders (or those reductions can't be done on them).<br>
><br>
>><br>
>> Maybe something like not(any(not(foo)))<br>
>> -> all(foo) and vice-versa?<br>
>><br>
>> ><br>
>> > On Mon, Jun 5, 2017 at 12:22 PM, Connor Abbott <<a href="mailto:cwabbott0@gmail.com">cwabbott0@gmail.com</a>><br>
>> > wrote:<br>
>> >><br>
>> >> Signed-off-by: Connor Abbott <<a href="mailto:cwabbott0@gmail.com">cwabbott0@gmail.com</a>><br>
>> >> ---<br>
>> >> src/compiler/nir/nir_<wbr>intrinsics.h | 14 ++++++++++++++<br>
>> >> src/compiler/nir/nir_opcodes.<wbr>py | 18 ++++++++++++++++--<br>
>> >> 2 files changed, 30 insertions(+), 2 deletions(-)<br>
>> >><br>
>> >> diff --git a/src/compiler/nir/nir_<wbr>intrinsics.h<br>
>> >> b/src/compiler/nir/nir_<wbr>intrinsics.h<br>
>> >> index 21e7d90..157df7f 100644<br>
>> >> --- a/src/compiler/nir/nir_<wbr>intrinsics.h<br>
>> >> +++ b/src/compiler/nir/nir_<wbr>intrinsics.h<br>
>> >> @@ -330,6 +330,20 @@ SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)<br>
>> >> SYSTEM_VALUE(alpha_ref_float, 1, 0, xx, xx, xx)<br>
>> >> SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)<br>
>> >> SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)<br>
>> >> +SYSTEM_VALUE(subgroup_<wbr>invocation, 1, 0, xx, xx, xx)<br>
>> >> +<br>
>> >> +<br>
>> >> +/* ARB_shader_ballot instructions */<br>
>> >> +<br>
>> >> +SYSTEM_VALUE(subgroup_eq_<wbr>mask, 1, 0, xx, xx, xx)<br>
>> >> +SYSTEM_VALUE(subgroup_ge_<wbr>mask, 1, 0, xx, xx, xx)<br>
>> >> +SYSTEM_VALUE(subgroup_gt_<wbr>mask, 1, 0, xx, xx, xx)<br>
>> >> +SYSTEM_VALUE(subgroup_le_<wbr>mask, 1, 0, xx, xx, xx)<br>
>> >> +SYSTEM_VALUE(subgroup_lt_<wbr>mask, 1, 0, xx, xx, xx)<br>
>> >> +<br>
>> >> +INTRINSIC(ballot, 1, ARR(0), true, 0, 0, 0, xx, xx, xx,<br>
>> >> + NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER |<br>
>> >> + NIR_INTRINSIC_CROSS_THREAD)<br>
>> >><br>
>> >> /* Blend constant color values. Float values are clamped. */<br>
>> >> SYSTEM_VALUE(blend_const_<wbr>color_r_float, 1, 0, xx, xx, xx)<br>
>> >> diff --git a/src/compiler/nir/nir_<wbr>opcodes.py<br>
>> >> b/src/compiler/nir/nir_<wbr>opcodes.py<br>
>> >> index be3ab6d..05a80b2 100644<br>
>> >> --- a/src/compiler/nir/nir_<wbr>opcodes.py<br>
>> >> +++ b/src/compiler/nir/nir_<wbr>opcodes.py<br>
>> >> @@ -120,8 +120,10 @@ def opcode(name, output_size, output_type,<br>
>> >> input_sizes, input_types,<br>
>> >> input_types, convergent, cross_thread,<br>
>> >> algebraic_properties, const_expr)<br>
>> >><br>
>> >> -def unop_convert(name, out_type, in_type, const_expr):<br>
>> >> - opcode(name, 0, out_type, [0], [in_type], "", const_expr)<br>
>> >> +def unop_convert(name, out_type, in_type, const_expr,<br>
>> >> cross_thread=False,<br>
>> >> + convergent=False):<br>
>> >> + opcode(name, 0, out_type, [0], [in_type], "", const_expr,<br>
>> >> convergent,<br>
>> >> + cross_thread)<br>
>> >><br>
>> >> def unop(name, ty, const_expr, convergent=False, cross_thread=False):<br>
>> >> opcode(name, 0, ty, [0], [ty], "", const_expr, convergent,<br>
>> >> cross_thread)<br>
>> >> @@ -355,6 +357,18 @@ for i in xrange(1, 5):<br>
>> >> for j in xrange(1, 5):<br>
>> >> unop_horiz("fnoise{0}_{1}".<wbr>format(i, j), i, tfloat, j, tfloat,<br>
>> >> "0.0f")<br>
>> >><br>
>> >> +# ARB_shader_ballot instructions<br>
>> >> +<br>
>> >> +opcode("read_invocation", 0, tuint, [0, 1], [tuint, tuint32], "",<br>
>> >> "src0",<br>
>> >> + cross_thread=True)<br>
>> >> +unop("read_first_invocation", tuint, "src0", cross_thread=True)<br>
>> >> +<br>
>> >> +# ARB_shader_group_vote instructions<br>
>> >> +<br>
>> >> +unop("any_invocations", tbool, "src0", cross_thread=True)<br>
>> >> +unop("all_invocations", tbool, "src0", cross_thread=True)<br>
>> >> +unop("all_invocations_equal", tbool, "true", cross_thread=True)<br>
>> >> +<br>
>> >> def binop_convert(name, out_type, in_type, alg_props, const_expr):<br>
>> >> opcode(name, 0, out_type, [0, 0], [in_type, in_type], alg_props,<br>
>> >> const_expr)<br>
>> >><br>
>> >> --<br>
>> >> 2.9.3<br>
>> >><br>
>> >> ______________________________<wbr>_________________<br>
>> >> mesa-dev mailing list<br>
>> >> <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
>> >> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
>> ><br>
>> ><br>
><br>
><br>
</div></div></blockquote></div><br></div></div>