[Mesa-dev] V2 radeonsi use STD430 packing of UBOs by default

Timothy Arceri tarceri at itsqueeze.com
Wed Aug 30 12:22:19 UTC 2017


On 30/08/17 20:07, Marek Olšák wrote:
> If LLVM was fixed to do the correct thing, we could enable CONSTBUF
> LOAD for LLVM 6.0 and later.

You seem to think that the compiler *should* be placing them near where 
they are used? What part of LLVM were you expecting to do this? I'm 
happy to do some digging around but don't know where I should start looking.
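
As a rough illustration of what "placing loads near where they are used" would mean, here is a toy sketch in Python. It is not LLVM or Mesa code: the tuple format and the name sink_loads are made up for this example, and it assumes SSA-style values (each destination written once) and loads from read-only memory, so moving a load later is always safe.

```python
# Toy model only: instructions are (opcode, dest, sources) tuples.
# This is NOT real TGSI or LLVM IR; all names here are hypothetical.

def sink_loads(instrs):
    """Move every LOAD to just before the first instruction that reads it."""
    pending = {}  # dest -> LOAD instruction that has been held back
    out = []

    def emit(ins):
        # Emit any held-back loads this instruction depends on first.
        for src in ins[2]:
            if src in pending:
                emit(pending.pop(src))
        out.append(ins)

    for ins in instrs:
        if ins[0] == "LOAD":
            pending[ins[1]] = ins  # defer the load until its first use
        else:
            emit(ins)

    # Loads that were never read are emitted at the end.
    for dest in list(pending):
        if dest in pending:
            emit(pending.pop(dest))
    return out


program = [
    ("LOAD", "t61", ("ubo2[0]",)),
    ("LOAD", "t62", ("ubo2[1]",)),
    ("DP4", "t65", ("t60", "t61")),
    ("DP4", "t66", ("t60", "t62")),
]
sunk = sink_loads(program)
for ins in sunk:
    print(ins)
```

With this input the second load ends up between the two DP4s instead of ahead of both. A real pass would also have to respect basic-block boundaries and latency hiding, which this sketch ignores.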

> 
> Marek
> 
> On Wed, Aug 30, 2017 at 9:18 AM, Timothy Arceri <tarceri at itsqueeze.com> wrote:
>> On 30/08/17 10:25, Marek Olšák wrote:
>>>
>>> I have to conclude that I don't see a way to use LOAD with CONSTBUF
>>> and keep the same performance as before. It looks like there are some
>>> deficiencies in our compiler stack that are unfixable in Mesa alone.
>>
>>
>> Well that's frustrating :( Pretty much makes finishing off uniform packing
>> [1] pointless. Besides an issue with matrices and some tidy ups it was
>> mostly done.
>>
>> [1] https://github.com/tarceri/Mesa/compare/uniform_packing5
>>
>>
>>>
>>> Marek
>>>
>>> On Wed, Aug 30, 2017 at 2:11 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>
>>>> Related IRC discussion:
>>>>
>>>> 00:01 < mareko> arsenm: what are the chances I can convince you to
>>>> allow me to set mayLoad = 0 on s_buffer_load_dword? :) the instruction
>>>> always reads from read-only memory with Mesa
>>>> 00:02 < mareko> apparently, readnone doesn't get through
>>>> 00:02 < arsenm> mareko: you should get the same effect by having
>>>> invariant on the MMO
>>>> 00:03 < mareko> arsenm: and how would I set invariant on SI.load.const?
>>>> 00:04 < arsenm> mareko: we create MMOs for a few other intrinsics
>>>> already, it should be the same
>>>> 00:05 < mareko> if only I had time to play with LLVM
>>>> 00:05 < arsenm> mareko: it looks like that is already done so it might
>>>> be a more specific problem
>>>> 00:05 < arsenm> that rematerializable scalar loads patch is probably
>>>> OK now though
>>>> 00:07 < arsenm> https://reviews.llvm.org/D11621
>>>>
>>>> Marek
>>>>
>>>>
>>>> On Wed, Aug 30, 2017 at 1:58 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>
>>>>> Interesting. It may be that glsl_to_tgsi uses copy propagation to fold
>>>>> those CONST loads into operands, which puts them next to their uses in
>>>>> LLVM.
>>>>>
>>>>> I guess LLVM doesn't understand that s_buffer_load_dword loads from
>>>>> immutable dereferenceable memory. It would benefit from mayLoad = 0 in
>>>>> this case I think.
>>>>>
>>>>> Marek
>>>>>
>>>>> On Thu, Aug 24, 2017 at 11:48 AM, Timothy Arceri <tarceri at itsqueeze.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 24/08/17 18:12, Nicolai Hähnle wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 24.08.2017 09:45, Timothy Arceri wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 22/08/17 22:14, Timothy Arceri wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm a little unsure what to do with this now. Below are my shader-db
>>>>>>>>> results; the majority of negative changes are from Natural Selection 2.
>>>>>>>>>
>>>>>>>>> I looked at some dumps of the worst Natural Selection 2 shaders and
>>>>>>>>> it seems to just be scheduling differences causing the regressions.
>>>>>>>>>
>>>>>>>>> I tested with sisched but that just made things even worse.
>>>>>>>>>
>>>>>>>>> Obviously we should be aiming to improve the scheduler, but since
>>>>>>>>> this regresses things and I have no evidence of it helping anything,
>>>>>>>>> the case for adding it is pretty weak.
>>>>>>>>>
>>>>>>>>> Thoughts??
>>>>>>>>>
>>>>>>>>> PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR  MaxWaves
>>>>>>>>> --------------------------------------------------------------------
>>>>>>>>>     All affected        5797    2.92 %    3.05 %    5.04 %   -2.94 %
>>>>>>>>> --------------------------------------------------------------------
>>>>>>>>>     Total              72287    0.28 %    0.34 %    0.33 %   -0.21 %
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> mesa-dev mailing list
>>>>>>>>> mesa-dev at lists.freedesktop.org
>>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> As far as I can tell, this is because after this change we end up with
>>>>>>>> large sections of consecutive loads. Any thoughts on avoiding this?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Odd. Do you see the same change in TGSI?
>>>>>>>
>>>>>>> This is one of those things that ideally LLVM would be smart about, but
>>>>>>> unfortunately it isn't really.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yeah I assume it's very doable since SSA makes this stuff reasonably
>>>>>> easy to deal with. However I'm not really sure where to begin, or how
>>>>>> welcome a pass to do this sorting would be. We have a similar pass in
>>>>>> nir for moving comparisons to where they are first used.
>>>>>>
>>>>>> The TGSI introduces an extra temp to store the value of the LOAD;
>>>>>> this is probably what triggers the difference in LLVM.
>>>>>>
>>>>>> eg.
>>>>>>
>>>>>>    LOAD TEMP[61], UBO[2], IMM[2].yyyy
>>>>>>    LOAD TEMP[62], UBO[2], IMM[1].zzzz
>>>>>>    LOAD TEMP[63], UBO[2], IMM[1].wwww
>>>>>>    LOAD TEMP[64], UBO[2], IMM[2].xxxx
>>>>>>    DP4 TEMP[65].x, TEMP[60], TEMP[61]
>>>>>>    DP4 TEMP[66].x, TEMP[60], TEMP[62]
>>>>>>    MOV TEMP[65].y, TEMP[66].xxxx
>>>>>>    DP4 TEMP[67].x, TEMP[60], TEMP[63]
>>>>>>    MOV TEMP[65].z, TEMP[67].xxxx
>>>>>>    DP4 TEMP[68].x, TEMP[60], TEMP[64]
>>>>>>    MOV TEMP[69].w, TEMP[68].xxxx
>>>>>>    MOV TEMP[69].xyz, TEMP[65].xyzx
>>>>>>    LOAD TEMP[70], UBO[1], IMM[6].yyyy
>>>>>>    LOAD TEMP[71], UBO[1], IMM[6].zzzz
>>>>>>    DP4 TEMP[72].x, TEMP[69], TEMP[70]
>>>>>>    DP4 TEMP[73].x, TEMP[69], TEMP[71]
>>>>>>    LOAD TEMP[74], UBO[1], IMM[6].wwww
>>>>>>    LOAD TEMP[75], UBO[1], IMM[7].xxxx
>>>>>>    LOAD TEMP[76], UBO[1], IMM[7].yyyy
>>>>>>    LOAD TEMP[77], UBO[1], IMM[7].zzzz
>>>>>>    DP4 TEMP[78].x, TEMP[69], TEMP[74]
>>>>>>    DP4 TEMP[79].x, TEMP[69], TEMP[75]
>>>>>>    MOV TEMP[78].y, TEMP[79].xxxx
>>>>>>    DP4 TEMP[80].x, TEMP[69], TEMP[76]
>>>>>>    MOV TEMP[78].z, TEMP[80].xxxx
>>>>>>    DP4 TEMP[81].x, TEMP[69], TEMP[77]
>>>>>>    MOV TEMP[78].w, TEMP[81].xxxx
>>>>>>
>>>>>> vs
>>>>>>
>>>>>>    DP4 TEMP[63].x, TEMP[62], CONST[2][0]
>>>>>>    DP4 TEMP[64].x, TEMP[62], CONST[2][1]
>>>>>>    MOV TEMP[63].y, TEMP[64].xxxx
>>>>>>    DP4 TEMP[65].x, TEMP[62], CONST[2][2]
>>>>>>    MOV TEMP[63].z, TEMP[65].xxxx
>>>>>>    DP4 TEMP[66].x, TEMP[62], CONST[2][3]
>>>>>>    MOV TEMP[67].w, TEMP[66].xxxx
>>>>>>    MOV TEMP[67].xyz, TEMP[63].xyzx
>>>>>>    DP4 TEMP[68].x, TEMP[67], CONST[1][14]
>>>>>>    DP4 TEMP[69].x, TEMP[67], CONST[1][15]
>>>>>>    DP4 TEMP[70].x, TEMP[67], CONST[1][8]
>>>>>>    DP4 TEMP[71].x, TEMP[67], CONST[1][9]
>>>>>>    MOV TEMP[70].y, TEMP[71].xxxx
>>>>>>    DP4 TEMP[72].x, TEMP[67], CONST[1][10]
>>>>>>    MOV TEMP[70].z, TEMP[72].xxxx
>>>>>>    DP4 TEMP[73].x, TEMP[67], CONST[1][11]
>>>>>>    MOV TEMP[70].w, TEMP[73].xxxx
>>>>>>    MOV TEMP[74].xyw, TEMP[70].xyxw
>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Nicolai
>>>>>>>

