[Mesa-dev] [PATCH 03/17] nir: take cross-thread operations into account into a few places

Nicolai Hähnle nhaehnle at gmail.com
Mon Jun 12 19:07:09 UTC 2017


On 12.06.2017 20:58, Connor Abbott wrote:
> On Mon, Jun 12, 2017 at 2:21 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> On 10.06.2017 01:44, Connor Abbott wrote:
>>>
>>> From: Connor Abbott <cwabbott0 at gmail.com>
>>>
>>> These optimizations happened to work with derivatives, but they won't
>>> with upcoming shader_ballot and group_vote instructions.
>>>
>>> Signed-off-by: Connor Abbott <cwabbott0 at gmail.com>
>>> ---
>>>    src/compiler/nir/nir_instr_set.c           | 22 ++++++++++++++++++++++
>>>    src/compiler/nir/nir_opt_peephole_select.c | 11 +++++++++++
>>>    2 files changed, 33 insertions(+)
>>>
>>> diff --git a/src/compiler/nir/nir_instr_set.c
>>> b/src/compiler/nir/nir_instr_set.c
>>> index 9cb9ed4..4bd0717 100644
>>> --- a/src/compiler/nir/nir_instr_set.c
>>> +++ b/src/compiler/nir/nir_instr_set.c
>>> @@ -178,6 +178,14 @@ hash_instr(const void *data)
>>>       const nir_instr *instr = data;
>>>       uint32_t hash = _mesa_fnv32_1a_offset_bias;
>>>    +   /*
>>> +    * In nir_instrs_equal(), we compare the instruction's basic blocks in
>>> this
>>> +    * case. See the comment there for the explanation.
>>> +    */
>>> +   if (nir_instr_is_cross_thread(instr) &&
>>> !nir_instr_is_convergent(instr)) {
>>> +      HASH(hash, instr->block);
>>> +   }
>>> +
>>>       switch (instr->type) {
>>>       case nir_instr_type_alu:
>>>          hash = hash_alu(hash, nir_instr_as_alu(instr));
>>> @@ -256,6 +264,20 @@ nir_instrs_equal(const nir_instr *instr1, const
>>> nir_instr *instr2)
>>>       if (instr1->type != instr2->type)
>>>          return false;
>>>    +   /*
>>> +    * If the instructions are cross-thread, then they must have the same
>>> +    * execution mask. If they are convergent, then we can always replace
>>> one
>>> +    * invocation with another since every invocation is guaranteed
>>> convergent.
>>> +    * But not so for non-convergent instructions, since different
>>> invocations
>>> +    * may be called with different execution maskes and therefore have
>>> +    * different results. Conservatively enforce that the instructions are
>>> in
>>> +    * the same basic block.
>>> +    */
>>> +   if (nir_instr_is_cross_thread(instr1) &&
>>> !nir_instr_is_convergent(instr1)) {
>>
>>
>> Hmm... this is another reason not to like the definition of "cross thread"
>> and "convergent". It seems like crossthread + convergent is a weaker
>> restriction that only crossthread, which I'd say is inherently unintuitive.
>> Can we make it so that each attribute only makes things more restrictive?
> 
> That's because you're thinking of it the wrong way -- here, convergent
> isn't a restriction. It's a guarantee, similar to nsw, nuw,
> UnsafeFPMath, etc. in LLVM. In particular, it's the guarantee that the
> user will never call this in non-uniform control flow. In the presence
> of this guarantee, cross_thread and LLVM convergent are actually the
> same restriction -- you can never make the exec mask larger in the
> first place, since it's already as large as possible. But the
> guarantee can enable additional optimizations as well since the
> optimizer can assume that control in the basic block (and other
> control-equivalent basic blocks) is dynamically uniform.

Okay, gotcha. That's actually a neat trick. There's still the issue of 
naming, but apart from that, patches 1-3 look good to me.

Cheers,
Nicolai


> 
>>
>> Cheers,
>> Nicolai
>>
>>
>>
>>> +      if (instr1->block != instr2->block)
>>> +         return false;
>>> +   }
>>> +
>>>       switch (instr1->type) {
>>>       case nir_instr_type_alu: {
>>>          nir_alu_instr *alu1 = nir_instr_as_alu(instr1);
>>> diff --git a/src/compiler/nir/nir_opt_peephole_select.c
>>> b/src/compiler/nir/nir_opt_peephole_select.c
>>> index 4ca4f80..ce41781 100644
>>> --- a/src/compiler/nir/nir_opt_peephole_select.c
>>> +++ b/src/compiler/nir/nir_opt_peephole_select.c
>>> @@ -61,6 +61,17 @@ static bool
>>>    block_check_for_allowed_instrs(nir_block *block, unsigned *count, bool
>>> alu_ok)
>>>    {
>>>       nir_foreach_instr(instr, block) {
>>> +      if (nir_instr_is_cross_thread(instr) &&
>>> !nir_instr_is_convergent(instr)) {
>>> +         /* If the instruction is cross-thread, then we can't execute it
>>> +          * conditionally when we would've executed it unconditionally
>>> before,
>>> +          * except when the condition is uniform. If the instruction is
>>> +          * convergent, though, we're already guaranteed that the entire
>>> +          * region is convergent (including the condition) so we can go
>>> ahead.
>>> +          *
>>> +          * TODO: allow when the if-condition is uniform
>>> +          */
>>> +         return false;
>>> +      }
>>>          switch (instr->type) {
>>>          case nir_instr_type_intrinsic: {
>>>             nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>>>
>>
>>
>> --
>> Lerne, wie die Welt wirklich ist,
>> Aber vergiss niemals, wie sie sein sollte.
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


More information about the mesa-dev mailing list