[Mesa-dev] V2 radeonsi use STD430 packing of UBOs by default

Marek Olšák maraeo at gmail.com
Tue Aug 29 23:58:47 UTC 2017


Interesting. It may be that glsl_to_tgsi uses copy propagation to fold
those CONST loads into operands, which puts them next to their uses in LLVM.

I guess LLVM doesn't understand that s_buffer_load_dword loads from
immutable dereferenceable memory. It would benefit from mayLoad = 0 in
this case I think.

Marek

On Thu, Aug 24, 2017 at 11:48 AM, Timothy Arceri <tarceri at itsqueeze.com> wrote:
>
>
> On 24/08/17 18:12, Nicolai Hähnle wrote:
>>
>> On 24.08.2017 09:45, Timothy Arceri wrote:
>>>
>>>
>>>
>>> On 22/08/17 22:14, Timothy Arceri wrote:
>>>>
>>>> I'm a little unsure what to do with this now. Below is my shader-db
>>>> results, the majority of negative changes are from Natural Selection
>>>> 2.
>>>>
>>>> I looked at some dumps of the worst Natural Selection 2 shaders and
>>>> it seems to just be scheduling differences causing the regressions.
>>>>
>>>> I tested with sisched but that just made things even worse.
>>>>
>>>> Obviously we should be aiming to improve the scheduler, but since
>>>> this regresses things and I have no evidence of it helping anything,
>>>> the case for adding it is pretty weak.
>>>>
>>>> Thoughts??
>>>>
>>>> PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR  MaxWaves
>>>> --------------------------------------------------------------------
>>>> All affected            5797    2.92 %    3.05 %    5.04 %   -2.94 %
>>>> --------------------------------------------------------------------
>>>> Total                  72287    0.28 %    0.34 %    0.33 %   -0.21 %
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>
>>>
>>>
>>> As far as I can tell this is because after this change we end up with
>>> large sections of consecutive loads. Any thoughts on avoiding this?
>>
>>
>> Odd. Do you see the same change in TGSI?
>>
>> This is one of those things that ideally LLVM would be smart about, but
>> unfortunately it isn't really.
>
>
> Yeah I assume it's very doable since SSA makes this stuff reasonably easy to
> deal with. However I'm not really sure where to begin, or how welcome a pass
> to do this sorting would be. We have a similar pass in nir for moving
> comparisons to where they are first used.
>
> The TGSI introduces an extra temp to store the value of the LOAD; this is
> probably what triggers the difference in LLVM.
>
> eg.
>
>  LOAD TEMP[61], UBO[2], IMM[2].yyyy
>  LOAD TEMP[62], UBO[2], IMM[1].zzzz
>  LOAD TEMP[63], UBO[2], IMM[1].wwww
>  LOAD TEMP[64], UBO[2], IMM[2].xxxx
>  DP4 TEMP[65].x, TEMP[60], TEMP[61]
>  DP4 TEMP[66].x, TEMP[60], TEMP[62]
>  MOV TEMP[65].y, TEMP[66].xxxx
>  DP4 TEMP[67].x, TEMP[60], TEMP[63]
>  MOV TEMP[65].z, TEMP[67].xxxx
>  DP4 TEMP[68].x, TEMP[60], TEMP[64]
>  MOV TEMP[69].w, TEMP[68].xxxx
>  MOV TEMP[69].xyz, TEMP[65].xyzx
>  LOAD TEMP[70], UBO[1], IMM[6].yyyy
>  LOAD TEMP[71], UBO[1], IMM[6].zzzz
>  DP4 TEMP[72].x, TEMP[69], TEMP[70]
>  DP4 TEMP[73].x, TEMP[69], TEMP[71]
>  LOAD TEMP[74], UBO[1], IMM[6].wwww
>  LOAD TEMP[75], UBO[1], IMM[7].xxxx
>  LOAD TEMP[76], UBO[1], IMM[7].yyyy
>  LOAD TEMP[77], UBO[1], IMM[7].zzzz
>  DP4 TEMP[78].x, TEMP[69], TEMP[74]
>  DP4 TEMP[79].x, TEMP[69], TEMP[75]
>  MOV TEMP[78].y, TEMP[79].xxxx
>  DP4 TEMP[80].x, TEMP[69], TEMP[76]
>  MOV TEMP[78].z, TEMP[80].xxxx
>  DP4 TEMP[81].x, TEMP[69], TEMP[77]
>  MOV TEMP[78].w, TEMP[81].xxxx
>
> vs
>
>  DP4 TEMP[63].x, TEMP[62], CONST[2][0]
>  DP4 TEMP[64].x, TEMP[62], CONST[2][1]
>  MOV TEMP[63].y, TEMP[64].xxxx
>  DP4 TEMP[65].x, TEMP[62], CONST[2][2]
>  MOV TEMP[63].z, TEMP[65].xxxx
>  DP4 TEMP[66].x, TEMP[62], CONST[2][3]
>  MOV TEMP[67].w, TEMP[66].xxxx
>  MOV TEMP[67].xyz, TEMP[63].xyzx
>  DP4 TEMP[68].x, TEMP[67], CONST[1][14]
>  DP4 TEMP[69].x, TEMP[67], CONST[1][15]
>  DP4 TEMP[70].x, TEMP[67], CONST[1][8]
>  DP4 TEMP[71].x, TEMP[67], CONST[1][9]
>  MOV TEMP[70].y, TEMP[71].xxxx
>  DP4 TEMP[72].x, TEMP[67], CONST[1][10]
>  MOV TEMP[70].z, TEMP[72].xxxx
>  DP4 TEMP[73].x, TEMP[67], CONST[1][11]
>  MOV TEMP[70].w, TEMP[73].xxxx
>  MOV TEMP[74].xyw, TEMP[70].xyxw
>
>>
>> Cheers,
>> Nicolai
>>

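The "sorting" idea discussed in the thread (moving each LOAD down to the
first instruction that reads it, so loads do not pile up in long consecutive
runs) can be illustrated with a toy sketch. This is not Mesa or LLVM code:
the helpers parse(), sink_loads() and max_live() are hypothetical names, the
parser only understands the simplified TGSI-like text quoted above, and the
reordering is only valid under the assumption that UBO contents are immutable
while the shader runs. The liveness count is a rough proxy for register
pressure in straight-line code, nothing more.

```python
import re

TEMP_RE = re.compile(r"TEMP\[(\d+)\]")

def parse(line):
    """Split 'OP dst, src, ...' into (opcode, dst temp, set of source temps)."""
    opcode, rest = line.split(None, 1)
    operands = [o.strip() for o in rest.split(",")]
    dst = int(TEMP_RE.search(operands[0]).group(1))
    srcs = {int(m) for o in operands[1:] for m in TEMP_RE.findall(o)}
    return opcode, dst, srcs

def sink_loads(lines):
    """Delay each LOAD until just before the first instruction that reads
    (or overwrites) its destination. Assumes UBO memory never changes."""
    out = []
    pending = {}  # dst temp -> LOAD line that has not been emitted yet
    for line in lines:
        opcode, dst, srcs = parse(line)
        # Emit any delayed loads this instruction depends on or clobbers.
        for t in sorted(srcs | {dst}):
            if t in pending:
                out.append(pending.pop(t))
        if opcode == "LOAD" and not srcs:
            pending[dst] = line   # no TEMP inputs, safe to delay
        else:
            out.append(line)
    out.extend(pending.values())  # loads that were never read
    return out

def max_live(lines):
    """Peak number of temps live after any instruction, treating a temp as
    live from its first write to its last read (never-read temps ignored)."""
    first_def, last_use = {}, {}
    for i, line in enumerate(lines):
        _, dst, srcs = parse(line)
        for t in srcs:
            first_def.setdefault(t, -1)  # read before any write: live-in
            last_use[t] = i
        first_def.setdefault(dst, i)
    return max(
        sum(1 for t, d in first_def.items() if d <= i < last_use.get(t, d))
        for i in range(len(lines))
    )

# First few instructions of the LOAD-based dump quoted in the thread.
SAMPLE = [
    "LOAD TEMP[61], UBO[2], IMM[2].yyyy",
    "LOAD TEMP[62], UBO[2], IMM[1].zzzz",
    "LOAD TEMP[63], UBO[2], IMM[1].wwww",
    "LOAD TEMP[64], UBO[2], IMM[2].xxxx",
    "DP4 TEMP[65].x, TEMP[60], TEMP[61]",
    "DP4 TEMP[66].x, TEMP[60], TEMP[62]",
    "MOV TEMP[65].y, TEMP[66].xxxx",
    "DP4 TEMP[67].x, TEMP[60], TEMP[63]",
    "MOV TEMP[65].z, TEMP[67].xxxx",
    "DP4 TEMP[68].x, TEMP[60], TEMP[64]",
]

if __name__ == "__main__":
    sunk = sink_loads(SAMPLE)
    print("\n".join(sunk))
    print("peak live temps:", max_live(SAMPLE), "->", max_live(sunk))
```

On this sample the sunk order interleaves each LOAD with the DP4 that
consumes it, which is roughly the shape the CONST-operand version of the
shader has. A real pass would also have to respect control flow, indirect
addressing and the latency-hiding benefit of issuing loads early, none of
which this sketch models.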
