[Mesa-dev] [PATCH 01/17] nir: introduce new convergent and cross-thread attributes

Mon Jun 12 18:58:41 UTC 2017

On 12.06.2017 20:50, Connor Abbott wrote:
> On Mon, Jun 12, 2017 at 2:17 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> On 10.06.2017 01:44, Connor Abbott wrote:
>>>
>>> From: Connor Abbott <cwabbott0 at gmail.com>
>>>
>>> These are properties of the instruction that must be respected when
>>> moving it around, in addition to the usual SSA dominance guarantee.
>>> Previously, we only had special handling for fddx and fddy, in a very
>>> ad-hoc way. But with ARB_shader_ballot and ARB_shader_group_vote, we'll
>>> have to start handling a lot more instructions with similar constraints,
>>> so we want to add a more formal model of what the optimizer can and
>>> cannot do.
>>>
>>> v2: don't add attribute for ALU instructions
>>> v3: special-case derivative ALU instructions
>>> Signed-off-by: Connor Abbott <cwabbott0 at gmail.com>
>>> ---
>>>    src/compiler/nir/nir.h | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 80 insertions(+)
>>>
>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>> index 3b827bf..64caccb 100644
>>> --- a/src/compiler/nir/nir.h
>>> +++ b/src/compiler/nir/nir.h
>>> @@ -985,6 +985,25 @@ typedef enum {
>>>        * intrinsic are due to the register reads/writes.
>>>        */
>>>       NIR_INTRINSIC_CAN_REORDER = (1 << 1),
>>> +
>>> +   /**
>>> +    * Indicates whether this intrinsic is "cross-thread". An operation is
>>> +    * cross-thread if results in one thread depend on inputs in another thread,
>>> +    * and therefore optimizations cannot change the execution mask when the
>>> +    * operation is called. Examples of cross-thread operations include
>>> +    * screen-space derivatives, the "any" reduction which returns "true" in
>>> +    * all threads if any thread inputs "true", etc.
>>> +    */
>>> +   NIR_INTRINSIC_CROSS_THREAD = (1 << 2),
>>> +
>>> +   /**
>>> +    * Indicates that this intrinsic is "convergent". An operation is
>>> +    * convergent when it must always be called in convergent control flow,
>>> +    * that is, control flow with the same execution mask as when the program
>>> +    * started. If an operation is convergent, it must be cross-thread as well,
>>> +    * since the optimizer must maintain the guarantee.
>>> +    */
>>> +   NIR_INTRINSIC_CONVERGENT = (1 << 3),
>>
>>
>> This is inconsistent with LLVM's definition of 'convergent', and I'd like
>> you to change it to match up with LLVM.
>>
>> LLVM's definition of convergent is: "The operation must not be made
>> control-dependent on additional values."
>>
>> In the language of execution masks, this means that optimizations must
>> guarantee that the execution mask for the instruction can only become a
>> superset of what it was originally. This means lifting is actually okay.
>>
>> This is relevant because e.g. texture instructions with implicit derivatives
>> are actually convergent operations (in the LLVM sense), but obviously they
>> can be called with exec masks that are subsets of the exec mask at program
>> start.
> 
> Actually, according to GLSL (and I think SPIR-V, although I'm not 100%
> sure), they can't be called that way -- results are undefined if
> derivatives (or textures that take implicit derivatives) aren't called
> in uniform control flow, full stop. That's why I changed the
> definition compared to LLVM - this definition of convergent allows all
> the optimizations that the LLVM definition does, but it opens up
> additional optimization opportunities since we can assume that control
> flow is always uniform when doing divergence analysis. Also, as-is,
> the definition matches the GLSL/SPIR-V semantics closely, and since
> the purpose of the convergent attribute is to model derivatives in
> GLSL and SPIR-V, I'd like to keep that. If GLSL or SPIR-V change their
> semantics to allow what you describe, then we can add something
> closer to the LLVM convergent semantics. If you want me to
> change the name to avoid confusion with LLVM, that's fair though --
> suggestions welcome on what to call it ;)

Okay, I'm convinced that it makes sense to have these semantics, but a 
different name would be good.

How about NIR_INTRINSIC_UNIFORM_CONTROL?
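
For reference, the two definitions discussed above can be written as predicates on execution masks. This is only a minimal sketch to show where they differ: the exec_mask_t type and both helper names are hypothetical and exist in neither NIR nor LLVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one bit per thread in the wave, set if the thread is active. */
typedef uint64_t exec_mask_t;

/* LLVM-style "convergent": after a transformation the operation may run with
 * more active threads than before, but never fewer, so the new mask must be
 * a superset of the original one. */
static bool
llvm_convergent_ok(exec_mask_t orig_mask, exec_mask_t new_mask)
{
   return (orig_mask & ~new_mask) == 0;
}

/* Semantics proposed in the patch: the operation must execute with the same
 * mask the program started with, i.e. in uniform control flow. */
static bool
uniform_control_ok(exec_mask_t start_mask, exec_mask_t new_mask)
{
   return new_mask == start_mask;
}
```

With a start mask of 0xF, an operation that originally ran with mask 0x3 and is lifted to run with 0xF passes the LLVM-style check, whereas under the patch's semantics the original placement with 0x3 was already outside what the source language defines.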

Cheers,
Nicolai

> 
>>
>> LLVM currently has no equivalent to cross_thread, and we hack around it as
>> I'm sure you're well aware. The nightmare is trying to find a sound
>> definition of "cross_thread" that works in LLVM's execution model.
> 
> Yeah... this stuff is really tricky to reason about. I think that
> eventually, we're going to have to add the notions of control flow
> divergence and re-convergence to LLVM's execution model, even though
> there's been pushback from some LLVM developers about it. I just don't
> see any way we'll be able to do stuff like LICM, aggressive CSE, etc.
> effectively in the presence of these cross-thread operations, when
> whether you can do those things at all depends on whether branch
> conditions are uniform.
> 
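
As a rough illustration of why code motion depends on branch uniformity, the check a pass would need before hoisting an instruction out of a branch could look like the sketch below. The struct and function names are hypothetical, the uniformity flag is assumed to come from a divergence analysis that is not shown, and the instruction is assumed to be otherwise free of side effects.

```c
#include <stdbool.h>

/* Hypothetical summary of an instruction and the branch that guards it. */
struct hoist_candidate {
   bool is_cross_thread;   /* result depends on inputs of other threads */
   bool branch_is_uniform; /* all active threads take the same direction */
};

/* Hoisting out of a divergent branch changes the set of active threads the
 * instruction executes with. That is harmless for ordinary instructions but
 * changes the result of a cross-thread one. If the branch is uniform, the
 * set of active threads is the same inside and outside the branch whenever
 * the branch is taken, so hoisting keeps the result. */
static bool
can_hoist_out_of_branch(struct hoist_candidate c)
{
   return !c.is_cross_thread || c.branch_is_uniform;
}
```

The same condition would apply to LICM when the loop exit condition plays the role of the branch.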
>>
>> Cheers,
>> Nicolai
>>
>>
>>
>>>    } nir_intrinsic_semantic_flag;
>>>      /**
>>> @@ -1459,6 +1478,67 @@ NIR_DEFINE_CAST(nir_instr_as_parallel_copy, nir_instr,
>>>                    type, nir_instr_type_parallel_copy)
>>>      /*
>>> + * Helpers to determine if an instruction is cross-thread or convergent. See
>>> + * NIR_INTRINSIC_{CONVERGENT|CROSS_THREAD} for the definitions.
>>> + */
>>> +static inline bool
>>> +nir_instr_is_convergent(const nir_instr *instr)
>>> +{
>>> +   switch (instr->type) {
>>> +   case nir_instr_type_alu:
>>> +      switch (nir_instr_as_alu(instr)->op) {
>>> +      case nir_op_fddx:
>>> +      case nir_op_fddy:
>>> +      case nir_op_fddx_fine:
>>> +      case nir_op_fddy_fine:
>>> +      case nir_op_fddx_coarse:
>>> +      case nir_op_fddy_coarse:
>>> +         /* Partial derivatives are convergent */
>>> +         return true;
>>> +
>>> +      default:
>>> +         return false;
>>> +      }
>>> +
>>> +   case nir_instr_type_intrinsic: {
>>> +      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>>> +      return nir_intrinsic_infos[intrin->intrinsic].flags &
>>> +         NIR_INTRINSIC_CONVERGENT;
>>> +   }
>>> +
>>> +   case nir_instr_type_tex:
>>> +      switch (nir_instr_as_tex(instr)->op) {
>>> +      case nir_texop_tex:
>>> +      case nir_texop_txb:
>>> +      case nir_texop_lod:
>>> +         /* These three take implicit derivatives, so they are convergent */
>>> +         return true;
>>> +
>>> +      default:
>>> +         return false;
>>> +      }
>>> +
>>> +   default:
>>> +      return false;
>>> +   }
>>> +}
>>> +
>>> +static inline bool
>>> +nir_instr_is_cross_thread(const nir_instr *instr)
>>> +{
>>> +   switch (instr->type) {
>>> +   case nir_instr_type_intrinsic: {
>>> +      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>>> +      return nir_intrinsic_infos[intrin->intrinsic].flags &
>>> +         NIR_INTRINSIC_CROSS_THREAD;
>>> +   }
>>> +
>>> +   default:
>>> +      return nir_instr_is_convergent(instr);
>>> +   }
>>> +}
>>> +
>>> +/*
>>>     * Control flow
>>>     *
>>>     * Control flow consists of a tree of control flow nodes, which include
>>>
>>
>>
>> --
>> Learn how the world really is,
>> but never forget how it should be.

-- 
Learn how the world really is,
but never forget how it should be.