[Mesa-dev] [PATCH 2/2] r600g: simplify and fix flushing and synchronization v2

Thu Jul 19 19:06:35 PDT 2012

I actually care a lot about lockups. Well, you are complaing about
lockups, yet you have quite obvious bugs in your hyperz code, so let's
fix them first. (I wouldn't even try and run the hyperz code in its
current state. Please don't take that personally.) Then, if the
lockups persist, we can start looking into *what* fixes them. You seem
to think that this patch helps a lot, but you don't say why. Aren't
you interested in what sequence of GPU commands helps? If I am
counting correctly, there are 7 changes in behavior in this patch. It
should be pretty easy to nail down the few that help, document them
(like /* these two lines fix a lockup with hyperz */), and discard the
rest. The documenting part is very important, so that the other
developers won't break your code accidentally.

Marek

On Fri, Jul 20, 2012 at 3:12 AM, Jerome Glisse <j.glisse at gmail.com> wrote:
> On Thu, Jul 19, 2012 at 9:00 PM, Marek Olšák <maraeo at gmail.com> wrote:
>> On Fri, Jul 20, 2012 at 1:34 AM, Jerome Glisse <j.glisse at gmail.com> wrote:
>>> On Thu, Jul 19, 2012 at 6:07 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>> I have these issues with the patch:
>>>>
>>>> 1) On GPUs without a vertex cache, you flush the texture cache every
>>>> draw operation. Are you kidding?
>>>
>>> Show me one app with perf regression due to that ? Or just go look at
>>> what fglrx is doing.
>>
>> I don't believe that fglrx unconditionally emits SURFACE_SYNC with
>> TC_ACTION_ENA before every DRAW packet. I just don't buy that. It's
>> too stupid to be true. And considering that it wasn't needed before,
>> it's not needed now either. Please give me some other argument than
>> just fglrx.
>
> No fglrx don't set it for each draw, fglrx set it if a bunch of reg is
> touch. Given than right now we pretty much always touch one of those
> reg btw draw it just turn up that my patch trigger the flush btw each
> draw.
>
>>>
>>>> 2) All colorbuffers / streamout buffers are flushed, even those which
>>>> are not enabled. E.g. instead of flushing only CB0 when there is only
>>>> one, this code flushes all of them. Why? This either needs an
>>>> explanation or it should only flush the buffers which are enabled
>>>> (like the old code did).
>>>
>>> fglrx + no perf regression ...
>>
>> The "no perf regression" argument doesn't apply here, because it just
>> might not be the bottleneck now. I'm willing to step aside from this
>> one issue though.
>
> I am just trying to stick to fglrx pattern.
>
>
>>>
>>>> 3) Please explain:
>>>> - why you added PS_PARTIAL_FLUSH in r600_texture_barrier and
>>>> r600_set_framebuffer_state.
>>>
>>> fglrx is doing something similar
>>
>> But not exactly the same thing, right? So there's no reason for it to be there.
>
> It's hard to do as fglrx as the pattern is evading me no matter how
> much different app command stream i look at i always find an exception
> to rule i formulating.
>
>>>
>>>> - why you added CACHE_FLUSH_AND_INV_EVENT in set_framebuffer_state for
>>>> R700 and evergreen.
>>>
>>> fglrx ...
>>>
>>>> - why you applied the CB flush workarounds meant for RV6xx to all R600
>>>> and R700 chipsets.
>>>
>>> fglrx ...
>>>
>>>> - why the streamout workaround for RV6xx (S_0085F0_DEST_BASE_0_ENA) is
>>>> applied to all R600, R700, and evergreen chipsets.
>>>
>>> didn't hurt thought fglrx is not using that at all but i did not
>>> wanted to remove it
>>
>> Well, you didn't remove it. You added it for those other chipsets.
>> That's a difference. You don't even know what you did there, do you?
>> :) All the things I mentioned are either half-assed or added for no
>> reason. Fglrx might do all sorts of stupid things or for its own
>> reasons, but that doesn't mean it's automatically good for us. Besides
>> that, it's almost impossible to figure out why a CS was built up
>> exactly the way it was without access to the driver code and to its developers.
>
> Oh yeah i don't have fucking clue, i am fucking cluesless, i am just a
> fool that write fucking random line of code and have no fucking idea
> of what i am doing. Of course you know better, please enlight me.
>
> I am totaly on board with fglrx doing stupid things but yet it does
> not lockup ... so one of those stupid things is important and until
> someone figure which one i would rather do more stupid thing and not
> lockup then trying to pretend that flushing is a bottleneck with the
> driver right now.
>
>>>
>>>> - why R600_CONTEXT_FLUSH_AND_INV emits SURFACE_SYNC on evergreen,
>>>> resulting in emission of SURFACE_SYNC twice in a row in most
>>>> situations.
>>>
>>> fglrx is doing that and without that lockup ...
>>
>> Hm, now you're talking. So do you need:
>>
>> FLUSH_AND_INV +
>> SURFACE_SYNC (COHER_CNTL = ~0)
>>
>> or do you need:
>>
>> FLUSH_AND_INV +
>> SURFACE_SYNC (COHER_CNTL = ~0) +
>> SURFACE_SYNC (COHER_CNTL = according to flags)
>>
>> for it not to lock up?
>
> flush inv is always follow by surface sync with few exception (on
> which i am not clear but there is always a surface sync before a draw
> after a flush inv.
>
>>>
>>>> Flushing has always worked without all the changes (1, 2, 3) mentioned
>>>> above, so please if you don't have a reasonable explanation, revert to
>>>> the old behavior.
>>>
>>> Well if you have a better solution please show me ...
>>
>> I already showed you in the first reply. If you are unwilling to
>> change your patches even a little bit, I'll happily take them over
>> from you.
>>
>> Marek
>
> Oh i will change them, just not the way you like, i am trying to avoid
> lockup, you oubviously don't give a shit about that
>
> Cheers,
> Jerome