[Mesa-dev] [PATCH 4/6] nir: add shader_clock intrinsic

Wed Oct 7 13:48:52 PDT 2015

On 7 October 2015 at 18:04, Connor Abbott <cwabbott0 at gmail.com> wrote:
> On Wed, Oct 7, 2015 at 7:51 AM, Emil Velikov <emil.l.velikov at gmail.com> wrote:
>> XXX: commit message, comment in nir_intrinsics.h
>>
>> Signed-off-by: Emil Velikov <emil.velikov at collabora.com>
>> ---
>>  src/glsl/nir/glsl_to_nir.cpp  | 6 ++++++
>>  src/glsl/nir/nir_intrinsics.h | 2 ++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
>> index efaa73e..231bdbf 100644
>> --- a/src/glsl/nir/glsl_to_nir.cpp
>> +++ b/src/glsl/nir/glsl_to_nir.cpp
>> @@ -698,6 +698,8 @@ nir_visitor::visit(ir_call *ir)
>>           op = nir_intrinsic_ssbo_atomic_exchange;
>>        } else if (strcmp(ir->callee_name(), "__intrinsic_ssbo_atomic_comp_swap_internal") == 0) {
>>           op = nir_intrinsic_ssbo_atomic_comp_swap;
>> +      } else if (strcmp(ir->callee_name(), "__intrinsic_shader_clock") == 0) {
>> +         op = nir_intrinsic_shader_clock;
>>        } else {
>>           unreachable("not reached");
>>        }
>> @@ -802,6 +804,10 @@ nir_visitor::visit(ir_call *ir)
>>        case nir_intrinsic_memory_barrier:
>>           nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
>>           break;
>> +      case nir_intrinsic_shader_clock:
>> +         nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
>> +         nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
>> +         break;
>>        case nir_intrinsic_store_ssbo: {
>>           exec_node *param = ir->actual_parameters.get_head();
>>           ir_rvalue *block = ((ir_instruction *)param)->as_rvalue();
>> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
>> index 263d8c1..4b32215 100644
>> --- a/src/glsl/nir/nir_intrinsics.h
>> +++ b/src/glsl/nir/nir_intrinsics.h
>> @@ -83,6 +83,8 @@ BARRIER(discard)
>>   */
>>  BARRIER(memory_barrier)
>>
>> +INTRINSIC(shader_clock, 0, ARR(), true, 1, 1, 0, 0 /* flags ? */)
>
> This should have NIR_INTRINSIC_CAN_DELETE, since if the result is
> unused we can safely delete it (i.e. it has no side effects), but we
> can't safely reorder it.
>
Thanks. Will do.

> Side note: NIR's current model, as well as any more flexible memory
> model we might adopt in the future, assumes that intrinsics which are
> marked as reorderable, as well as ALU operations which are implicitly
> reorderable, can be freely reordered with respect to *any* other
> operation, even one that's explicitly not reorderable. So, for
> example, if you do:
>
> ... = clock();
> a = b + c;
> ... = clock();
>
> then there are no guarantees that the addition won't get moved outside
> the clock() calls. Currently, this will only happen if the addition
> becomes involved in some algebraic optimization or CSE, but in the
> future with passes like GCM that move code around indiscriminately
> it's going to be much more of a problem. I don't think we could really
> solve this problem in a useful and general way without making the rest
> of NIR significantly more complicated and slower, which I definitely
> don't want. I think the best answer is to say "really these tools are
> unreliable and meant mainly for driver developers and people who know
> what they're doing, and if you use them you have to be prepared to
> look at the assembly source and see if it matches what you expected."
>
I haven't looked at the optimisations closely and I assumed that all
intrinsics act as motion barriers. Seems like I was mistaking.
Can we call it a "where sub-group is implementation dependent" and be
done with it ;-)

Thanks
Emil