[Mesa-dev] V2 radeonsi use STD430 packing of UBOs by default

Marek Olšák maraeo at gmail.com
Wed Aug 30 00:25:06 UTC 2017


I have to conclude that I don't see a way to use LOAD with CONSTBUF
and keep the same performance as before. It looks like there are some
deficiencies in our compiler stack that are unfixable in Mesa alone.

Marek

On Wed, Aug 30, 2017 at 2:11 AM, Marek Olšák <maraeo at gmail.com> wrote:
> Related IRC discussion:
>
> 00:01 < mareko> arsenm: what are the chances I can convince you to
> allow me to set mayLoad = 0 on s_buffer_load_dword? :) the instruction
> always reads from read-only memory with Mesa
> 00:02 < mareko> apparently, readnone doesn't get through
> 00:02 < arsenm> mareko: you should get the same effect by having
> invariant on the MMO
> 00:03 < mareko> arsenm: and how would I set invariant on SI.load.const?
> 00:04 < arsenm> mareko: we create MMOs for a few other intrinsics
> already, it should be the same
> 00:05 < mareko> if only I had time to play with LLVM
> 00:05 < arsenm> mareko: it looks like that is already done so it might
> be a more specific problem
> 00:05 < arsenm> that rematerializable scalar loads patch is probably
> OK now though
> 00:07 < arsenm> https://reviews.llvm.org/D11621
>
> Marek
>
>
> On Wed, Aug 30, 2017 at 1:58 AM, Marek Olšák <maraeo at gmail.com> wrote:
>> Interesting. It may be that glsl_to_tgsi uses copy propagation to fold
>> those CONST loads into operands, which puts them next to their uses in LLVM.
>>
>> I guess LLVM doesn't understand that s_buffer_load_dword loads from
>> immutable dereferenceable memory. It would benefit from mayLoad = 0 in
>> this case I think.
>>
>> Marek
>>
>> On Thu, Aug 24, 2017 at 11:48 AM, Timothy Arceri <tarceri at itsqueeze.com> wrote:
>>>
>>>
>>> On 24/08/17 18:12, Nicolai Hähnle wrote:
>>>>
>>>> On 24.08.2017 09:45, Timothy Arceri wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 22/08/17 22:14, Timothy Arceri wrote:
>>>>>>
>>>>>> I'm a little unsure what to do with this now. Below are my shader-db
>>>>>> results; the majority of negative changes are from Natural Selection
>>>>>> 2.
>>>>>>
>>>>>> I looked at some dumps of the worst Natural Selection 2 shaders and
>>>>>> it seems to just be scheduling differences causing the regressions.
>>>>>>
>>>>>> I tested with sisched but that just made things even worse.
>>>>>>
>>>>>> Obviously we should be aiming to improve the scheduler, but since
>>>>>> this regresses things and I have no evidence of it helping anything,
>>>>>> the case for adding it is pretty weak.
>>>>>>
>>>>>> Thoughts??
>>>>>>
>>>>>> PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR  MaxWaves
>>>>>> --------------------------------------------------------------------
>>>>>> All affected            5797    2.92 %    3.05 %    5.04 %   -2.94 %
>>>>>> --------------------------------------------------------------------
>>>>>> Total                  72287    0.28 %    0.34 %    0.33 %   -0.21 %
>>>>>>
>>>>>> _______________________________________________
>>>>>> mesa-dev mailing list
>>>>>> mesa-dev at lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>>>
>>>>>
>>>>>
>>>>> As far as I can tell this is because after this change we end up with
>>>>> large sections of consecutive loads. Any thoughts on avoiding this?
>>>>
>>>>
>>>> Odd. Do you see the same change in TGSI?
>>>>
>>>> This is one of those things that ideally LLVM would be smart about, but
>>>> unfortunately it isn't really.
>>>
>>>
>>> Yeah I assume it's very doable since SSA makes this stuff reasonably easy to
>>> deal with. However I'm not really sure where to begin, or how welcome a pass
>>> to do this sorting would be. We have a similar pass in nir for moving
>>> comparisons to where they are first used.
>>>
>>> The TGSI introduces an extra temp to store the value of the LOAD; this is
>>> probably what triggers the difference in LLVM.
>>>
>>> e.g.
>>>
>>>  LOAD TEMP[61], UBO[2], IMM[2].yyyy
>>>  LOAD TEMP[62], UBO[2], IMM[1].zzzz
>>>  LOAD TEMP[63], UBO[2], IMM[1].wwww
>>>  LOAD TEMP[64], UBO[2], IMM[2].xxxx
>>>  DP4 TEMP[65].x, TEMP[60], TEMP[61]
>>>  DP4 TEMP[66].x, TEMP[60], TEMP[62]
>>>  MOV TEMP[65].y, TEMP[66].xxxx
>>>  DP4 TEMP[67].x, TEMP[60], TEMP[63]
>>>  MOV TEMP[65].z, TEMP[67].xxxx
>>>  DP4 TEMP[68].x, TEMP[60], TEMP[64]
>>>  MOV TEMP[69].w, TEMP[68].xxxx
>>>  MOV TEMP[69].xyz, TEMP[65].xyzx
>>>  LOAD TEMP[70], UBO[1], IMM[6].yyyy
>>>  LOAD TEMP[71], UBO[1], IMM[6].zzzz
>>>  DP4 TEMP[72].x, TEMP[69], TEMP[70]
>>>  DP4 TEMP[73].x, TEMP[69], TEMP[71]
>>>  LOAD TEMP[74], UBO[1], IMM[6].wwww
>>>  LOAD TEMP[75], UBO[1], IMM[7].xxxx
>>>  LOAD TEMP[76], UBO[1], IMM[7].yyyy
>>>  LOAD TEMP[77], UBO[1], IMM[7].zzzz
>>>  DP4 TEMP[78].x, TEMP[69], TEMP[74]
>>>  DP4 TEMP[79].x, TEMP[69], TEMP[75]
>>>  MOV TEMP[78].y, TEMP[79].xxxx
>>>  DP4 TEMP[80].x, TEMP[69], TEMP[76]
>>>  MOV TEMP[78].z, TEMP[80].xxxx
>>>  DP4 TEMP[81].x, TEMP[69], TEMP[77]
>>>  MOV TEMP[78].w, TEMP[81].xxxx
>>>
>>> vs
>>>
>>>  DP4 TEMP[63].x, TEMP[62], CONST[2][0]
>>>  DP4 TEMP[64].x, TEMP[62], CONST[2][1]
>>>  MOV TEMP[63].y, TEMP[64].xxxx
>>>  DP4 TEMP[65].x, TEMP[62], CONST[2][2]
>>>  MOV TEMP[63].z, TEMP[65].xxxx
>>>  DP4 TEMP[66].x, TEMP[62], CONST[2][3]
>>>  MOV TEMP[67].w, TEMP[66].xxxx
>>>  MOV TEMP[67].xyz, TEMP[63].xyzx
>>>  DP4 TEMP[68].x, TEMP[67], CONST[1][14]
>>>  DP4 TEMP[69].x, TEMP[67], CONST[1][15]
>>>  DP4 TEMP[70].x, TEMP[67], CONST[1][8]
>>>  DP4 TEMP[71].x, TEMP[67], CONST[1][9]
>>>  MOV TEMP[70].y, TEMP[71].xxxx
>>>  DP4 TEMP[72].x, TEMP[67], CONST[1][10]
>>>  MOV TEMP[70].z, TEMP[72].xxxx
>>>  DP4 TEMP[73].x, TEMP[67], CONST[1][11]
>>>  MOV TEMP[70].w, TEMP[73].xxxx
>>>  MOV TEMP[74].xyw, TEMP[70].xyxw
>>>
>>>>
>>>> Cheers,
>>>> Nicolai
>>>>

