[Mesa-dev] [PATCH 1/2] ac: avoid infinite loop storing doubles to ssbo
Bas Nieuwenhuizen
bas at basnieuwenhuizen.nl
Wed Jan 17 13:36:06 UTC 2018
I put some alternative fixes together that make the test pass:
https://patchwork.freedesktop.org/series/36612/
On Wed, Jan 17, 2018 at 1:32 PM, Timothy Arceri <tarceri at itsqueeze.com> wrote:
>
>
> On 17/01/18 22:36, Bas Nieuwenhuizen wrote:
>>
>> On Wed, Jan 17, 2018 at 10:47 AM, Timothy Arceri <tarceri at itsqueeze.com>
>> wrote:
>>>
>>> Without this count will always be greater than 4 and we will always
>>> set the writemask so the loop can never exit.
>>>
>>> Fixes: 91074bb11bda "radv/ac: Implement Float64 SSBO stores."
>>> ---
>>> src/amd/common/ac_nir_to_llvm.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/src/amd/common/ac_nir_to_llvm.c
>>> b/src/amd/common/ac_nir_to_llvm.c
>>> index ddcd546b93..c24e563695 100644
>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>> @@ -2453,6 +2453,7 @@ static void visit_store_ssbo(struct ac_nir_context
>>> *ctx,
>>> if (count > 4) {
>>> writemask |= ((1u << (count - 4)) - 1u) <<
>>> (start + 4);
>>> count = 4;
>>> + elem_size_mul = 1;
>>
>>
>> This seems confusing to me. We do the initial iteration writemask
>> check before multiplying by elem_size_mul, so writemask is in terms of
>> 64-bit components, but after the first iteration, writemask is in
>> terms of 32-bit components? Looks like we should expand the bitmask
>> beforehand similarly as in visit_store_var, and then don't multiply by
>> elem_size_mul in the loop.
>
>
> I meant to add somewhere that this doesn't fix the test I'm looking at it
> just stops it locking up my machine. There are a bunch of other problems
> once we get past this point. e.g.
>
> Intrinsic name not mangled correctly for type arguments! Should be:
> llvm.amdgcn.buffer.store.v8f32
> void (<8 x float>, <4 x i32>, i32, i32, i1, i1)*
> @llvm.amdgcn.buffer.store.v4f32
> Intrinsic name not mangled correctly for type arguments! Should be:
> llvm.amdgcn.buffer.store.v8f32
> void (<8 x float>, <4 x i32>, i32, i32, i1, i1)*
> @llvm.amdgcn.buffer.store.v4f32
> Call parameter type does not match function signature!
> <4 x float> bitcast (<2 x double> <double 8.000000e+00, double 9.000000e+00>
> to <4 x float>)
> <8 x float> call void @llvm.amdgcn.buffer.store.v4f32(<4 x float> bitcast
> (<2 x double> <double 8.000000e+00, double 9.000000e+00> to <4 x float>), <4
> x i32> %59, i32 0, i32 112, i1 false, i1 false) #3
>
>
> I'm still not really sure how this code is intended to work.
>
> I'm using the ./bin/arb_gpu_shader_fp64-layout-std140-fp64-shader -auto
> piglit test for testing, note it also requires [1] to avoid crashing
> earlier.
>
> [1] https://patchwork.freedesktop.org/patch/197723/
>
>
>
>>
>>> }
>>>
>>> if (count == 4) {
>>> --
>>> 2.14.3
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list