[Mesa-dev] [PATCH 4/6] nir: add shader_clock intrinsic

Connor Abbott cwabbott0 at gmail.com
Thu Oct 8 09:04:45 PDT 2015


On Thu, Oct 8, 2015 at 10:09 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
> On 8 October 2015 at 14:20, Connor Abbott <cwabbott0 at gmail.com> wrote:
>> On Thu, Oct 8, 2015 at 6:54 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>>> On 7 October 2015 at 23:50, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>>> On Wed, Oct 7, 2015 at 4:48 PM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>>>>> On 7 October 2015 at 18:04, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>>>>> On Wed, Oct 7, 2015 at 7:51 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>>>>>>> XXX: commit message, comment in nir_intrinsics.h
>>>>>>>
>>>>>>> Signed-off-by: Emil Velikov <emil.velikov at collabora.com>
>>>>>>> ---
>>>>>>>  src/glsl/nir/glsl_to_nir.cpp  | 6 ++++++
>>>>>>>  src/glsl/nir/nir_intrinsics.h | 2 ++
>>>>>>>  2 files changed, 8 insertions(+)
>>>>>>>
>>>>>>> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
>>>>>>> index efaa73e..231bdbf 100644
>>>>>>> --- a/src/glsl/nir/glsl_to_nir.cpp
>>>>>>> +++ b/src/glsl/nir/glsl_to_nir.cpp
>>>>>>> @@ -698,6 +698,8 @@ nir_visitor::visit(ir_call *ir)
>>>>>>>           op = nir_intrinsic_ssbo_atomic_exchange;
>>>>>>>        } else if (strcmp(ir->callee_name(), "__intrinsic_ssbo_atomic_comp_swap_internal") == 0) {
>>>>>>>           op = nir_intrinsic_ssbo_atomic_comp_swap;
>>>>>>> +      } else if (strcmp(ir->callee_name(), "__intrinsic_shader_clock") == 0) {
>>>>>>> +         op = nir_intrinsic_shader_clock;
>>>>>>>        } else {
>>>>>>>           unreachable("not reached");
>>>>>>>        }
>>>>>>> @@ -802,6 +804,10 @@ nir_visitor::visit(ir_call *ir)
>>>>>>>        case nir_intrinsic_memory_barrier:
>>>>>>>           nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
>>>>>>>           break;
>>>>>>> +      case nir_intrinsic_shader_clock:
>>>>>>> +         nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
>>>>>>> +         nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
>>>>>>> +         break;
>>>>>>>        case nir_intrinsic_store_ssbo: {
>>>>>>>           exec_node *param = ir->actual_parameters.get_head();
>>>>>>>           ir_rvalue *block = ((ir_instruction *)param)->as_rvalue();
>>>>>>> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
>>>>>>> index 263d8c1..4b32215 100644
>>>>>>> --- a/src/glsl/nir/nir_intrinsics.h
>>>>>>> +++ b/src/glsl/nir/nir_intrinsics.h
>>>>>>> @@ -83,6 +83,8 @@ BARRIER(discard)
>>>>>>>   */
>>>>>>>  BARRIER(memory_barrier)
>>>>>>>
>>>>>>> +INTRINSIC(shader_clock, 0, ARR(), true, 1, 1, 0, 0 /* flags ? */)
>>>>>>
>>>>>> This should have NIR_INTRINSIC_CAN_DELETE, since if the result is
>>>>>> unused we can safely delete it (i.e. it has no side effects), but we
>>>>>> can't safely reorder it.
>>>>>>
>>>>> Thanks. Will do.
>>>>>
>>>>>> Side note: NIR's current model, as well as any more flexible memory
>>>>>> model we might adopt in the future, assumes that intrinsics which are
>>>>>> marked as reorderable, as well as ALU operations which are implicitly
>>>>>> reorderable, can be freely reordered with respect to *any* other
>>>>>> operation, even one that's explicitly not reorderable. So, for
>>>>>> example, if you do:
>>>>>>
>>>>>> ... = clock();
>>>>>> a = b + c;
>>>>>> ... = clock();
>>>>>>
>>>>>> then there are no guarantees that the addition won't get moved outside
>>>>>> the clock() calls. Currently, this will only happen if the addition
>>>>>> becomes involved in some algebraic optimization or CSE, but in the
>>>>>> future with passes like GCM that move code around indiscriminately
>>>>>> it's going to be much more of a problem. I don't think we could really
>>>>>> solve this problem in a useful and general way without making the rest
>>>>>> of NIR significantly more complicated and slower, which I definitely
>>>>>> don't want. I think the best answer is to say "really these tools are
>>>>>> unreliable and meant mainly for driver developers and people who know
>>>>>> what they're doing, and if you use them you have to be prepared to
>>>>>> look at the assembly source and see if it matches what you expected."
>>>>>>
>>>>> I haven't looked at the optimisations closely and I assumed that all
>>>>> intrinsics act as motion barriers. Seems like I was mistaking.
>>>>> Can we call it a "where sub-group is implementation dependent" and be
>>>>> done with it ;-)
>>>>
>>>> Or even "The units of time are not defined and need not be constant"
>>>> -- I guess "return 0;" would be a legal implementation ;).
>>>>
>>> Bikeshedding aside - the spec is quite clear about the motion barrier
>>> part. Personally I'm fine either way - leave it as is or look closer
>>> at NIR. Just let know how you feel on the topic.
>>
>> It's my opinion, and Jason's as well, that implementing a "code motion
>> barrier" as the spec describes is practically impossible in NIR or
>> really any decent SSA-based IR. So we just can't follow that part of
>> the spec.
>>
>>>
>>>> But really the issue isn't with spec lawyering, it's with people
>>>> potentially using it without knowing the caveats about the underlying
>>>> compiler stack and how it might not always do what they think it does.
>>>>
>>> Replace "compiler stack" with "object X" and you can apply it to
>>> everything in life :-P
>>
>> True :) although in this case, it's perhaps a bit more serious since
>> it's against the "spirit" of the spec. But again, there's nothing we
>> can really do about it.
>>
> Fair enough. I am not going to dig any more on the topic.
>
> Can you suggest some patterns/sequences that won't get optimised (too
> heavily) to use as piglit tests ? Would be nice to not have them
> regress randomly, when opt X gets added/removed.
>
> Cheers
> Emil

As for tests... I don't think there's too much you can realistically
test. The units are unspecified, due to context switches and other
things the time might vary wildly for the same program, the counter is
supposed to be monotonically increasing except that it can wrap
around, etc. I don't think that you could do much to test it beyond
some simple touch-testing (does the function exist, can I call it
without crashing, etc.) without some platform-specific knowledge.


More information about the mesa-dev mailing list