[Mesa-dev] [PATCH] loader/dri3: Overhaul dri3_update_num_back
Michel Dänzer
michel at daenzer.net
Fri Sep 2 01:06:33 UTC 2016
On 02/09/16 12:37 AM, Alex Deucher wrote:
> On Thu, Sep 1, 2016 at 11:28 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Aug 31, 2016 11:39 PM, "Michel Dänzer" <michel at daenzer.net> wrote:
>>> On 01/09/16 02:05 PM, Jason Ekstrand wrote:
>>>> On Wed, Aug 31, 2016 at 7:00 PM, Michel Dänzer <michel at daenzer.net> wrote:
>>>>
>>>> On 31/08/16 11:21 PM, Jason Ekstrand wrote:
>>>> > On Aug 19, 2016 12:07 AM, "Michel Dänzer" <michel at daenzer.net> wrote:
>>>> >> From: Michel Dänzer <michel.daenzer at amd.com>
>>>> >>
>>>> >> Always use 3 buffers when flipping. With only 2 buffers, we have to
>>>> >> wait for a flip to complete (which takes non-0 time even with
>>>> >> asynchronous flips) before we can start working on the next frame. We
>>>> >> were previously only using 2 buffers for flipping if the X server
>>>> >> supports asynchronous flips, even when we're not using asynchronous
>>>> >> flips. This could result in bad performance (the referenced bug report
>>>> >> is an extreme case, where the inter-frame stalls were preventing the
>>>> >> GPU from reaching its maximum clocks).
>>>> >
>>>> > Sorry for the post-push review but I don't usually pay much attention
>>>> > to the window system code. In any case, I believe you're doing your
>>>> > counting wrong. When flipping with swapinterval=0, you need 4 buffers:
>>>> >
>>>> > 1. The buffer currently being scanned out (will be released at next
>>>> > vblank)
>>>> > 2. The buffer X has queued for scanout but is waiting on vblank
>>>>
>>>> s/vblank/flip/g, since async flips may not wait for vblank, but
>>>> yeah.
>>>>
>>>> > 3. The buffer the application has just submitted which X will queue
>>>> > next if it doesn't get another before the window closes.
>>>> > 4. The buffer the application is using for rendering.
>>>> >
>>>> > With only 3, you get a stall during that window in which X has queued
>>>> > another flip but we're waiting on vblank before the flip begins. Am I
>>>> > missing something?
>>>>
>>>> Nothing, except maybe the paragraph below stating that I couldn't
>>>> measure any benefit from using 4 buffers. :) I'm not exactly sure why,
>>>> but I suspect it might be because even with just 3 buffers, the GPU can
>>>> always work on at least one frame ahead of time.
>>>>
>>>> Also note that even before my change, we were only using 3 buffers when
>>>> the X driver supports async flips (with swap interval 0; only 2 buffers
>>>> with swap interval > 0).
>>>>
>>>>
>>>> Yes, because with async flips you don't have a buffer sitting queued in
>>>> the kernel waiting to be flipped which you can't cancel.
>>>
>>> Actually, there is. Even async flips take non-0 time to complete.
>>>
>>>
>>>> that makes perfect sense.
>>>
>>> What exactly does? My change may not be perfect, but the logic before it
>>> was mostly backwards.
>>
>> I think perhaps the problem here is that I don't know what you mean by
>> "async flips". It's an X term that obviously does not mean what I thought
>> it meant.
>
> Async means immediate (or as close to it as possible, maybe hsync
> depending on the hw) not at vsync.
Exactly.
>>>> That said, I'd be interested in hearing about any test cases where 4
>>>> buffers provide a significant boost over 3.
>>>>
>>>>
>>>> A little history that may be useful: Quadbuffering was originally added
>>>> for DRI3+present here:
>>>>
>>>>
>>>> https://cgit.freedesktop.org/mesa/mesa/commit/?id=f7a355556ef5fe23056299a77414f9ad8b5e5a1d
>>>
>>> So the commit message claims. If you look at the code after that change
>>> though, it's basically impossible to end up with 4 buffers (at least
>>> permanently), since it would require all these conditions to be true at
>>> the same time:
>>>
>>> 1. priv->flipping (the last Present request was completed as a flip)
>>> 2. !(priv->present_capabilities & XCB_PRESENT_CAPABILITY_ASYNC) (the X
>>> driver doesn't support async flips)
>>> 3. priv->swap_interval == 0
>>>
>>> Given 2, 1 & 3 are mutually exclusive.
>>
>> I'm not seeing how 1 & 3 are mutually exclusive. priv->swap_interval
>> doesn't seem to have anything to do with whether or not you're flipping.
priv->swap_interval == 0 can only use flips if async flips are supported.
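
The mutual-exclusion argument can be checked with a small sketch. This is not the actual loader code: the struct, the helper names and the CAP_ASYNC constant (a stand-in for XCB_PRESENT_CAPABILITY_ASYNC) are hypothetical, and the reachability rule only encodes the statement above that swap interval 0 can flip only when async flips are supported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XCB_PRESENT_CAPABILITY_ASYNC; real code would
 * take the constant from <xcb/present.h>. */
#define CAP_ASYNC 1u

/* Hypothetical, reduced drawable state: only the three fields discussed. */
struct drawable_state {
   bool flipping;                 /* last Present request completed as a flip */
   uint32_t present_capabilities; /* capability bits reported by the X driver */
   int swap_interval;
};

/* The three conditions that all had to hold for 4 buffers to be used. */
static bool
wants_four_buffers(const struct drawable_state *s)
{
   return s->flipping &&
          !(s->present_capabilities & CAP_ASYNC) &&
          s->swap_interval == 0;
}

/* Assumed constraint: with swap interval 0, flips need async support. */
static bool
state_is_reachable(const struct drawable_state *s)
{
   if (s->flipping && s->swap_interval == 0)
      return (s->present_capabilities & CAP_ASYNC) != 0;
   return true;
}

/* Enumerate all combinations and count the reachable ones that would
 * select 4 buffers. */
static int
count_reachable_four_buffer_states(void)
{
   int count = 0;

   for (int flipping = 0; flipping <= 1; flipping++) {
      for (uint32_t caps = 0; caps <= CAP_ASYNC; caps++) {
         for (int interval = 0; interval <= 1; interval++) {
            struct drawable_state s = { flipping != 0, caps, interval };

            if (state_is_reachable(&s) && wants_four_buffers(&s))
               count++;
         }
      }
   }
   return count;
}

/* Under the assumed constraint, no steady state selects 4 buffers. */
static void
check_mutual_exclusion(void)
{
   assert(count_reachable_four_buffer_states() == 0);
}
```

Calling check_mutual_exclusion() passes under these assumptions: the only combination that satisfies all three conditions is the one the reachability rule excludes. It says nothing about transient states, which is why "at least permanently" matters above.
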
So WRT https://bugs.freedesktop.org/show_bug.cgi?id=97549, let's not
jump to any conclusions but look at how many buffers actually end up
being used for what reasons in each case. I suspect there might be some
surprises. :)
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer