[Mesa-dev] [PATCH 12/15] ac: add support for SPV_AMD_shader_ballot

Thu Nov 2 16:10:22 UTC 2017

On 31.10.2017 16:36, Connor Abbott wrote:
> On Tue, Oct 31, 2017 at 2:08 AM, Dave Airlie <airlied at gmail.com> wrote:
>>> +LLVMValueRef
>>> +ac_build_subgroup_inclusive_scan(struct ac_llvm_context *ctx,
>>> +                                LLVMValueRef src,
>>> +                                ac_reduce_op reduce,
>>> +                                LLVMValueRef identity)
>>> +{
>>> +       /* See http://gpuopen.com/amd-gcn-assembly-cross-lane-operations/
>>> +        *
>>> +        * Note that each dpp/reduce pair is supposed to be compiled down to
>>> +        * one instruction by LLVM, at least for 32-bit values.
>>> +        *
>>> +        * TODO: use @llvm.amdgcn.ds.swizzle on SI and CI
>>> +        */
>>> +       LLVMValueRef value = src;
>>> +       value = reduce(ctx, value,
>>> +                      ac_build_dpp(ctx, identity, src,
>>> +                                   dpp_row_sr(1), 0xf, 0xf, false));
>>> +       value = reduce(ctx, value,
>>> +                      ac_build_dpp(ctx, identity, src,
>>> +                                   dpp_row_sr(2), 0xf, 0xf, false));
>>> +       value = reduce(ctx, value,
>>> +                      ac_build_dpp(ctx, identity, src,
>>> +                                   dpp_row_sr(3), 0xf, 0xf, false));
>>> +       value = reduce(ctx, value,
>>> +                      ac_build_dpp(ctx, identity, value,
>>> +                                   dpp_row_sr(4), 0xf, 0xe, false));
>>> +       value = reduce(ctx, value,
>>> +                      ac_build_dpp(ctx, identity, value,
>>> +                                   dpp_row_sr(8), 0xf, 0xc, false));
>>> +       value = reduce(ctx, value,
>>> +                      ac_build_dpp(ctx, identity, value,
>>> +                                   dpp_row_bcast15, 0xa, 0xf, false));
>>> +       value = reduce(ctx, value,
>>> +                      ac_build_dpp(ctx, identity, value,
>>> +                                   dpp_row_bcast31, 0xc, 0xf, false));
>>
>> btw, I dumped some shaders from Doom on -pro.
>>
>> It looked like it ended up with:
>>
>> 1, 0xf, 0xf
>> 2, 0xf, 0xf
>> 4, 0xf, 0xf
>> 8, 0xf, 0xf
>> bcast15, 0xa, 0xf
>> bcast31, 0xc, 0xf
>>
>> It also seems to apply these directly to the instructions, like:
>> /*000000002b80*/ s_nop           0x0
>> /*000000002b84*/ v_min_u32       v83, v83, v83 row_shr:1 bank_mask:15 row_mask:15
>> /*000000002b8c*/ s_nop           0x1
>> /*000000002b90*/ v_min_u32       v83, v83, v83 row_shr:2 bank_mask:15 row_mask:15
>> /*000000002b98*/ s_nop           0x1
>> /*000000002b9c*/ v_min_u32       v83, v83, v83 row_shr:4 bank_mask:15 row_mask:15
>> /*000000002ba4*/ s_nop           0x1
>> /*000000002ba8*/ v_min_u32       v83, v83, v83 row_shr:8 bank_mask:15 row_mask:15
>> /*000000002bb0*/ s_nop           0x1
>> /*000000002bb4*/ v_min_u32       v83, v83, v83 row_bcast15 bank_mask:15 row_mask:10
>> /*000000002bbc*/ s_nop           0x1
>> /*000000002bc0*/ v_min_u32       v83, v83, v83 row_bcast31 bank_mask:15 row_mask:12
>>
>> I think the instruction combining is probably an LLVM job, but I
>> wonder if the different row_shr sequence is what we should use as
>> well.
> 
> Yeah, LLVM should be combining the move and min (hence the comment
> here), but it doesn't yet. That shouldn't be too hard to do once we get
> it working. Also, I've seen that way of doing it before, and IIRC it's
> one instruction slower than the sequence in the blog post I cited.
> Even though there's one fewer instruction, there's an extra two-cycle
> stall between the first two instructions, since v83 is the destination
> of the first instruction and the DPP source of the second (hence the
> s_nop 0x1). So once we combine instructions, this should be better
> than what -pro does :)

Agreed, though even more ideally, LLVM would be able to fill those gaps 
with other instructions ;)

Anyway, the combining of instructions is really the important task.
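
For anyone who wants to convince themselves that the two sequences
discussed above compute the same thing, here is a minimal CPU-side sketch
in C. It is only a simplified model under stated assumptions, not Mesa or
LLVM code: a 64-lane wave, rows of 16 lanes, banks of 4 lanes,
bound_ctrl = false, unsigned addition as the reduce op and 0 as its
identity. The names dpp_mov, step, scan_patch, scan_pro and check_scans
are hypothetical helpers made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_SIZE 64 /* lanes per wave */
#define ROW_SIZE  16 /* lanes per DPP row */
#define BANK_SIZE 4  /* lanes per bank within a row */

enum dpp_kind { ROW_SHR, ROW_BCAST15, ROW_BCAST31 };

/* Simplified model of a DPP move with bound_ctrl = false: a lane that is
 * masked off, or whose source lane does not exist, receives `old`. */
static void
dpp_mov(uint32_t dst[WAVE_SIZE], uint32_t old, const uint32_t src[WAVE_SIZE],
        enum dpp_kind kind, int shift, unsigned row_mask, unsigned bank_mask)
{
        for (int lane = 0; lane < WAVE_SIZE; lane++) {
                int row = lane / ROW_SIZE;
                int pos = lane % ROW_SIZE;
                int bank = pos / BANK_SIZE;
                int from = -1; /* -1 means "no valid source lane" */

                if (kind == ROW_SHR) {
                        assert(shift > 0 && shift < ROW_SIZE);
                        if (pos >= shift)
                                from = lane - shift;
                } else if (kind == ROW_BCAST15) {
                        if (row > 0)
                                from = row * ROW_SIZE - 1; /* last lane of previous row */
                } else if (row >= 2) {
                        from = 31; /* last lane of row 1 */
                }

                int enabled = ((row_mask >> row) & 1) && ((bank_mask >> bank) & 1);
                dst[lane] = (enabled && from >= 0) ? src[from] : old;
        }
}

/* One dpp/reduce pair: value = value + dpp(dpp_src), with identity 0.
 * dpp_src may alias value; it is fully read before value is written. */
static void
step(uint32_t value[WAVE_SIZE], const uint32_t dpp_src[WAVE_SIZE],
     enum dpp_kind kind, int shift, unsigned row_mask, unsigned bank_mask)
{
        uint32_t moved[WAVE_SIZE];

        dpp_mov(moved, 0, dpp_src, kind, shift, row_mask, bank_mask);
        for (int lane = 0; lane < WAVE_SIZE; lane++)
                value[lane] += moved[lane];
}

/* Sequence from the patch: row_sr 1..3 read src, later steps read value. */
static void
scan_patch(uint32_t value[WAVE_SIZE], const uint32_t src[WAVE_SIZE])
{
        for (int lane = 0; lane < WAVE_SIZE; lane++)
                value[lane] = src[lane];
        step(value, src, ROW_SHR, 1, 0xf, 0xf);
        step(value, src, ROW_SHR, 2, 0xf, 0xf);
        step(value, src, ROW_SHR, 3, 0xf, 0xf);
        step(value, value, ROW_SHR, 4, 0xf, 0xe);
        step(value, value, ROW_SHR, 8, 0xf, 0xc);
        step(value, value, ROW_BCAST15, 0, 0xa, 0xf);
        step(value, value, ROW_BCAST31, 0, 0xc, 0xf);
}

/* Sequence from the -pro dump: row_shr 1, 2, 4, 8, each reading value. */
static void
scan_pro(uint32_t value[WAVE_SIZE], const uint32_t src[WAVE_SIZE])
{
        for (int lane = 0; lane < WAVE_SIZE; lane++)
                value[lane] = src[lane];
        step(value, value, ROW_SHR, 1, 0xf, 0xf);
        step(value, value, ROW_SHR, 2, 0xf, 0xf);
        step(value, value, ROW_SHR, 4, 0xf, 0xf);
        step(value, value, ROW_SHR, 8, 0xf, 0xf);
        step(value, value, ROW_BCAST15, 0, 0xa, 0xf);
        step(value, value, ROW_BCAST31, 0, 0xc, 0xf);
}

/* Returns 1 if both sequences match a plain serial prefix sum. */
int
check_scans(void)
{
        uint32_t src[WAVE_SIZE], a[WAVE_SIZE], b[WAVE_SIZE];
        uint32_t sum = 0;

        for (int lane = 0; lane < WAVE_SIZE; lane++)
                src[lane] = 3u * (uint32_t)lane + 1u;
        scan_patch(a, src);
        scan_pro(b, src);
        for (int lane = 0; lane < WAVE_SIZE; lane++) {
                sum += src[lane];
                if (a[lane] != sum || b[lane] != sum)
                        return 0;
        }
        return 1;
}
```

Calling check_scans() from a small test program should return 1 if the
model is right. Note that it says nothing about the wait states
discussed above; it only checks that both orderings give the same
inclusive scan.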

Cheers,
Nicolai

> 
>>
>> Dave.
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Learn how the world really is,
but never forget how it ought to be.