[Mesa-dev] [RFC PATCH 1/2] r600/sb: Set flags for GROUP_BARRIER instruction and force it into slot X

Wed Jan 10 21:20:42 UTC 2018

On Wed, Jan 10, 2018 at 3:50 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> On Wed, Jan 10, 2018 at 3:27 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>> On Wed, Jan 10, 2018 at 3:13 PM, Gert Wollny <gw.fossdev at gmail.com> wrote:
>>> On Wednesday, 10.01.2018, at 16:36 +0100, Gert Wollny wrote:
>>>> This seems to satisfy the sb optimizer, i.e. no regressions in the
>>>> piglits compared to disabling sb for tessellation shaders with
>>>> barriers but enabling them in general.
>>>> ---
>>>
>>> Actually, it seems this is not enough, at least for Tomb Raider which
>>> uses one tessellation control shader with a barrier. The optimizer
>>> reorders the LDS instructions around the barrier, so that the optimized
>>> version has more reads before the barrier than the original bytecode.
>>>
>>> The number of writes is the same though, and as far as I can tell from
>>> the TGSI, the values written to LDS before the barrier are not read
>>> back within the shader - which makes me wonder whether the barrier is
>>> actually necessary.
>>
>> If your hardware executes all the vertices in parallel, then a barrier
>> should be unnecessary.
>
> While this is true, you also need to be careful that reads after the
> barrier don't get reordered wrt any writes before the barrier --
> nouveau might be more conservative, but sb might be more aggressive
> here.

I thought about following up with that disclaimer. However... any
reordering that's not outright broken should be fine, and I couldn't come
up with a viable counterexample. That holds as long as a read of $other
invoc's output at a location is treated as depending on a write to *this*
invoc's output at that location.
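
A minimal sketch of that dependency rule, in Python purely for
illustration. LdsOp, depends and can_swap are hypothetical names, not
anything in sb or Mesa, and the model is an assumption: it only tracks
which output location an access touches and whose slot it is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LdsOp:
    """Hypothetical model of one LDS access in a tess control shader."""
    kind: str        # "read" or "write"
    invocation: str  # "this" or "other": whose output slot is accessed
    location: int    # output location being accessed


def depends(later: LdsOp, earlier: LdsOp) -> bool:
    """Return True if `later` must stay after `earlier` in program order."""
    # Accesses to different locations never conflict in this model.
    if later.location != earlier.location:
        return False
    # Two reads of the same location can be freely reordered.
    if later.kind == "read" and earlier.kind == "read":
        return False
    # Conservative rule: the invocation field is deliberately ignored, so a
    # read of another invocation's output at a location still depends on a
    # write to this invocation's output at the same location.
    return True


def can_swap(first: LdsOp, second: LdsOp) -> bool:
    """Return True if two adjacent ops may be exchanged by a scheduler."""
    return not depends(second, first)


write_this = LdsOp("write", "this", 3)
read_other_same = LdsOp("read", "other", 3)
read_other_diff = LdsOp("read", "other", 4)

print(can_swap(write_this, read_other_same))  # must keep order
print(can_swap(write_this, read_other_diff))  # free to reorder
```

Under this rule a scheduler never hoists a read of location 3 above the
write to location 3, no matter which invocation's slot the read targets,
while accesses to unrelated locations stay free to move.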

  -ilia