[Mesa-dev] [PATCH 01/17] nir: introduce new convergent and cross-thread attributes

Jason Ekstrand jason at jlekstrand.net
Tue Jun 13 02:19:30 UTC 2017


On Mon, Jun 12, 2017 at 11:58 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:

> On 12.06.2017 20:50, Connor Abbott wrote:
>
>> On Mon, Jun 12, 2017 at 2:17 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>>
>>> On 10.06.2017 01:44, Connor Abbott wrote:
>>>
>>>>
>>>> From: Connor Abbott <cwabbott0 at gmail.com>
>>>>
>>>> These are properties of the instruction that must be respected when
>>>> moving it around, in addition to the usual SSA dominance guarantee.
>>>> Previously, we only had special handling for fddx and fddy, in a very
>>>> ad-hoc way. But with arb_shader_ballot and arb_shader_group_vote, we'll
>>>> have to start handling a lot more instructions with similar constraints,
>>>> so we want to add a more formal model of what the optimizer can and
>>>> cannot do.
>>>>
>>>> v2: don't add attribute for ALU instructions
>>>> v3: special-case derivative ALU instructions
>>>> Signed-off-by: Connor Abbott <cwabbott0 at gmail.com>
>>>> ---
>>>>    src/compiler/nir/nir.h | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    1 file changed, 80 insertions(+)
>>>>
>>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>>> index 3b827bf..64caccb 100644
>>>> --- a/src/compiler/nir/nir.h
>>>> +++ b/src/compiler/nir/nir.h
>>>> @@ -985,6 +985,25 @@ typedef enum {
>>>>        * intrinsic are due to the register reads/writes.
>>>>        */
>>>>       NIR_INTRINSIC_CAN_REORDER = (1 << 1),
>>>> +
>>>> +   /**
>>>> +    * Indicates whether this intrinsic is "cross-thread". An operation is
>>>> +    * cross-thread if results in one thread depend on inputs in another thread,
>>>> +    * and therefore optimizations cannot change the execution mask when the
>>>> +    * operation is called. Examples of cross-thread operations include
>>>> +    * screen-space derivatives, the "any" reduction which returns "true" in
>>>> +    * all threads if any thread inputs "true", etc.
>>>> +    */
>>>> +   NIR_INTRINSIC_CROSS_THREAD = (1 << 2),
>>>> +
>>>> +   /**
>>>> +    * Indicates that this intrinsic is "convergent". An operation is
>>>> +    * convergent when it must always be called in convergent control flow,
>>>> +    * that is, control flow with the same execution mask as when the program
>>>> +    * started. If an operation is convergent, it must be cross-thread as well,
>>>> +    * since the optimizer must maintain the guarantee.
>>>> +    */
>>>> +   NIR_INTRINSIC_CONVERGENT = (1 << 3),
>>>>
>>>
>>>
>>> This is inconsistent with LLVM's definition of 'convergent', and I'd like
>>> you to change it to match up with LLVM.
>>>
>>> LLVM's definition of convergent is: "The operation must not be made
>>> control-dependent on additional values."
>>>
>>> In the language of execution masks, this means that optimizations must
>>> guarantee that the execution mask for the instruction can only become a
>>> superset of what it was originally. This means lifting is actually okay.
>>>
>>> This is relevant because e.g. texture instructions with implicit
>>> derivatives are actually convergent operations (in the LLVM sense), but
>>> obviously they can be called with exec masks that are subsets of the exec
>>> mask at program start.
>>>
>>
>> Actually, according to GLSL (and I think SPIR-V, although I'm not 100%
>> sure), they can't be called that way -- results are undefined if
>> derivatives (or textures that take implicit derivatives) aren't called
>> in uniform control flow, full stop. That's why I changed the
>> definition compared to LLVM - this definition of convergent allows all
>> the optimizations that the LLVM definition does, but it opens up
>> additional optimization opportunities since we can assume that control
>> flow is always uniform when doing divergence analysis. Also, as-is,
>> the definition matches the GLSL/SPIR-V semantics closely, and since
>> the purpose of the convergent attribute is to model derivatives in
>> GLSL and SPIR-V, I'd like to keep that. If GLSL or SPIR-V change their
>> semantics to allow what you describe, then we can add something
>> closer to the LLVM convergent semantics. If you want me to
>> change the name to avoid confusion with LLVM, that's fair though --
>> suggestions welcome on what to call it ;)
>>
>
> Okay, I'm convinced that it makes sense to have these semantics, but a
> different name would be good.
>

I'm not quite so convinced. :-)  The LLVM definition seems, at first blush,
more powerful than the proposed definition and I think it's actually what
you want for most optimizations.  The only advantage I can see to the
strict uniform definition is that it would let us infer information about
control-flow uniformity from instructions.  However, while probably
technically correct, that sounds like a dangerous path to go down.  What
specific optimizations were you thinking this stricter definition would
enable?
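
To make the two candidate definitions concrete, here is a minimal sketch
in terms of execution masks. The exec_mask type and both function names
are hypothetical, made up for illustration; nothing like them exists in
NIR or LLVM. It assumes one bit per thread and models only the constraint
each definition places on the mask an operation runs with.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one bit per thread, bit set = thread is active. */
typedef uint64_t exec_mask;

/* LLVM-style rule as described in this thread: a transformation may
 * only grow the set of threads executing the operation, so every
 * thread in the original mask must still be in the new one.
 */
static bool
llvm_convergent_move_ok(exec_mask before, exec_mask after)
{
   return (before & ~after) == 0;
}

/* Stricter rule from the patch: the operation must run with exactly
 * the mask the program started with.
 */
static bool
uniform_control_ok(exec_mask entry, exec_mask at_op)
{
   return at_op == entry;
}
```

Under this model, hoisting an operation from a branch with mask 0x3 to a
point with mask 0xF passes the LLVM-style check, while the stricter check
would already have rejected the original placement at 0x3 when the entry
mask is 0xF.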


> How about NIR_INTRINSIC_UNIFORM_CONTROL?
>

That works.

--Jason


> Cheers,
> Nicolai
>
>
>
>
>
>>
>>> LLVM currently has no equivalent to cross_thread, and we hack around it as
>>> I'm sure you're well aware. The nightmare is trying to find a sound
>>> definition of "cross_thread" that works in LLVM's execution model.
>>>
>>
>> Yeah... this stuff is really tricky to reason about. I think that
>> eventually, we're going to have to add the notions of control flow
>> divergence and re-convergence to LLVM's execution model, even though
>> there's been pushback from some LLVM developers about it. I just don't
>> see any way we'll be able to do stuff like LICM, aggressive CSE, etc.
>> effectively in the presence of these cross-thread operations, when
>> whether you can do those things at all depends on whether branch
>> conditions are uniform.
>>
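
One way to read the dependence on branch uniformity is sketched below.
This is a runtime model with made-up names (exec_mask, branch_is_uniform),
not anything from NIR or LLVM; a compiler would have to prove the same
property statically, for example through divergence analysis. The
assumption is that moving a cross-thread operation across a branch is
only safe when the branch leaves the set of active threads intact, or
when no thread takes the branch at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one bit per thread, bit set = thread is active. */
typedef uint64_t exec_mask;

/* A branch condition is uniform when every active thread agrees on it:
 * either none of the active threads take the branch or all of them do.
 * In the "all" case the mask inside the branch equals the mask outside,
 * so a cross-thread operation sees the same threads on either side.
 */
static bool
branch_is_uniform(exec_mask active, exec_mask cond_true)
{
   exec_mask taken = active & cond_true;
   return taken == 0 || taken == active;
}
```

With four active threads (0xF), a condition that holds in only two of
them (0x3) is divergent, so the mask inside the branch is a strict subset
and the operation cannot be moved across it under this model.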
>>
>>> Cheers,
>>> Nicolai
>>>
>>>
>>>
>>>>    } nir_intrinsic_semantic_flag;
>>>>      /**
>>>> @@ -1459,6 +1478,67 @@ NIR_DEFINE_CAST(nir_instr_as_parallel_copy, nir_instr,
>>>>                    type, nir_instr_type_parallel_copy)
>>>>      /*
>>>> + * Helpers to determine if an instruction is cross-thread or convergent. See
>>>> + * NIR_INTRINSIC_{CONVERGENT|CROSS_THREAD} for the definitions.
>>>> + */
>>>> +static inline bool
>>>> +nir_instr_is_convergent(const nir_instr *instr)
>>>> +{
>>>> +   switch (instr->type) {
>>>> +   case nir_instr_type_alu:
>>>> +      switch (nir_instr_as_alu(instr)->op) {
>>>> +      case nir_op_fddx:
>>>> +      case nir_op_fddy:
>>>> +      case nir_op_fddx_fine:
>>>> +      case nir_op_fddy_fine:
>>>> +      case nir_op_fddx_coarse:
>>>> +      case nir_op_fddy_coarse:
>>>> +         /* Partial derivatives are convergent */
>>>> +         return true;
>>>> +
>>>> +      default:
>>>> +         return false;
>>>> +      }
>>>> +
>>>> +   case nir_instr_type_intrinsic: {
>>>> +      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>>>> +      return nir_intrinsic_infos[intrin->intrinsic].flags &
>>>> +         NIR_INTRINSIC_CONVERGENT;
>>>> +   }
>>>> +
>>>> +   case nir_instr_type_tex:
>>>> +      switch (nir_instr_as_tex(instr)->op) {
>>>> +      case nir_texop_tex:
>>>> +      case nir_texop_txb:
>>>> +      case nir_texop_lod:
>>>> +         /* These three take implicit derivatives, so they are convergent */
>>>> +         return true;
>>>> +
>>>> +      default:
>>>> +         return false;
>>>> +      }
>>>> +
>>>> +   default:
>>>> +      return false;
>>>> +   }
>>>> +}
>>>> +
>>>> +static inline bool
>>>> +nir_instr_is_cross_thread(const nir_instr *instr)
>>>> +{
>>>> +   switch (instr->type) {
>>>> +   case nir_instr_type_intrinsic: {
>>>> +      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>>>> +      return nir_intrinsic_infos[intrin->intrinsic].flags &
>>>> +         NIR_INTRINSIC_CROSS_THREAD;
>>>> +   }
>>>> +
>>>> +   default:
>>>> +      return nir_instr_is_convergent(instr);
>>>> +   }
>>>> +}
>>>> +
>>>> +/*
>>>>     * Control flow
>>>>     *
>>>>     * Control flow consists of a tree of control flow nodes, which include
>>>>
>>>>
>>>
>>> --
>>> Lerne, wie die Welt wirklich ist,
>>> Aber vergiss niemals, wie sie sein sollte.
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>
>