[Mesa-dev] [PATCH] ac: fix exclusive scans on GFX8-GFX9

Samuel Pitoiset samuel.pitoiset at gmail.com
Wed Aug 21 14:07:11 UTC 2019


On 8/21/19 3:59 PM, Bas Nieuwenhuizen wrote:
> On Wed, Aug 21, 2019 at 3:45 PM Samuel Pitoiset
> <samuel.pitoiset at gmail.com> wrote:
>> This fixes a regression introduced with scan&reduce operations
>> on GFX10. Note that some subgroups CTS still fail on GFX10 but
>> I assume it's a different issue.
>>
>> This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
>>
>> Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
>> ---
>>   src/amd/common/ac_llvm_build.c | 7 +++----
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
>> index 05871f5ea98..d72eaa2db46 100644
>> --- a/src/amd/common/ac_llvm_build.c
>> +++ b/src/amd/common/ac_llvm_build.c
>> @@ -4221,10 +4221,7 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
>>          if (ctx->chip_class >= GFX10) {
>>                  result = inclusive ? src : identity;
>>          } else {
>> -               if (inclusive)
>> -                       result = src;
>> -               else
>> -                       result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
>> +               result = src;
>>          }
>>          if (maxprefix <= 1)
>>                  return result;
>> @@ -4333,6 +4330,8 @@ ac_build_exclusive_scan(struct ac_llvm_context *ctx, LLVMValueRef src, nir_op op
>>                  get_reduction_identity(ctx, op, ac_get_type_size(LLVMTypeOf(src)));
>>          result = LLVMBuildBitCast(ctx->builder, ac_build_set_inactive(ctx, src, identity),
>>                                    LLVMTypeOf(identity), "");
>> +       if (ctx->chip_class <= GFX9)
>> +               result = ac_build_dpp(ctx, identity, result, dpp_wf_sr1, 0xf, 0xf, false);
> Kinda annoying that we still do the inclusive/exclusive logic for
> gfx10 inside ac_build_scan. Can we keep this inside the function by
> using a intermediate src?
Looks better indeed, I will make that change.
>
>>          result = ac_build_scan(ctx, op, result, identity, ctx->wave_size, false);
>>
>>          return ac_build_wwm(ctx, result);
>> --
>> 2.22.1
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list