[Mesa-dev] [PATCH] loader/dri3: Overhaul dri3_update_num_back

Thu Sep 1 05:05:25 UTC 2016

On Wed, Aug 31, 2016 at 7:00 PM, Michel Dänzer <michel at daenzer.net> wrote:

> On 31/08/16 11:21 PM, Jason Ekstrand wrote:
> > On Aug 19, 2016 12:07 AM, "Michel Dänzer" <michel at daenzer.net> wrote:
> >> From: Michel Dänzer <michel.daenzer at amd.com>
> >>
> >> Always use 3 buffers when flipping. With only 2 buffers, we have to wait
> >> for a flip to complete (which takes non-0 time even with asynchronous
> >> flips) before we can start working on the next frame. We were previously
> >> only using 2 buffers for flipping if the X server supports asynchronous
> >> flips, even when we're not using asynchronous flips. This could result
> >> in bad performance (the referenced bug report is an extreme case, where
> >> the inter-frame stalls were preventing the GPU from reaching its maximum
> >> clocks).
> >
> > Sorry for the post-push review but I don't usually pay much attention to
> > the window system code.  In any case, I believe you're doing your
> > counting wrong.  When flipping with swapinterval=0, you need 4 buffers:
> >
> > 1. The buffer currently being scanned out (will be released at next vblank)
> > 2. The buffer X has queued for scanout but is waiting on vblank
>
> s/vblank/flip/g, since async flips may not wait for vblank, but yeah.
>
> > 3. The buffer the application has just submitted which X will queue next
> > if it doesn't get another before the window closes.
> > 4. The buffer the application is using for rendering.
> >
> > With only 3, you get a stall during that window in which X has queued
> > another flip but we're waiting on vblank before the flip begins. Am I
> > missing something?
>
> Nothing, except maybe the paragraph below stating that I couldn't
> measure any benefit from using 4 buffers. :) I'm not exactly sure why,
> but I suspect it might be because even with just 3 buffers, the GPU can
> always work on at least one frame ahead of time.
>
> Also note that even before my change, we were only using 3 buffers when
> the X driver supports async flips (with swap interval 0; only 2 buffers
> with swap interval > 0).
>

Yes, because with async flips you don't have a buffer sitting queued in the
kernel waiting to be flipped which you can't cancel.  That makes perfect
sense.
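
To make the counting concrete, here is a minimal sketch in C of the buffer
roles from the list quoted above.  It is not Mesa code: the function name
and its parameter are hypothetical, and it assumes the only difference
between the two cases is whether a flip can sit queued in the kernel where
it can't be cancelled.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of Mesa: counts the buffer roles from the
 * list quoted above for a flipping client that never wants to stall.
 */
static int
buffers_needed_without_stall(bool flip_waits_in_kernel)
{
   int n = 0;

   n += 1;                    /* currently being scanned out */
   if (flip_waits_in_kernel)
      n += 1;                 /* queued for scanout, cannot be cancelled */
   n += 1;                    /* submitted by the app, X will queue it next */
   n += 1;                    /* being rendered to by the app */

   return n;
}
```

With flip_waits_in_kernel set this counts 4 buffers, and 3 otherwise.  It
only counts roles; it says nothing about whether the fourth buffer gives a
measurable speedup, which is the open question here.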

> That said, I'd be interested in hearing about any test cases where 4
> buffers provide a significant boost over 3.
>

A little history that may be useful: Quadbuffering was originally added for
DRI3+present here:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=f7a355556ef5fe23056299a77414f9ad8b5e5a1d

In Wayland, the change was made here:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=992a2dbba80aba35efe83202e1013bd6143f0dba

Unfortunately, neither of those commits gives precise metrics.  Eero's bug had
some very concrete numbers.  Hopefully he can provide you with the details
you need for further analysis.

>
> >> I couldn't measure any performance boost using 4 buffers with flipping.
> >> Performance actually seemed to go down slightly, but that might have
> >> been just noise.
>
>
> --
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer
>
>