Buffer release events (was: Add support for eglSwapInterval)

Jason Ekstrand jason at jlekstrand.net
Thu Oct 24 18:27:05 PDT 2013

This is mostly a recap of the discussion between Neil and myself on IRC
this morning.

I think a client can guarantee that it gets all of the buffer releases by
simply calling wl_display.sync after the attach/damage/commit.  It can even
delete the wl_callback object immediately without attaching a listener.
What follows is a revised version of the proof of this that I gave on IRC
this morning.

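As an illustration, the pattern looks roughly like this in a wayland-client
program.  This is only a sketch: `swap_one_frame` is a hypothetical helper,
and the display, surface, and buffer are assumed to be set up elsewhere;
only the stock wayland-client calls are real.

```c
#include <wayland-client.h>

/* Sketch of the attach/damage/commit + sync pattern described above.
 * swap_one_frame() is a hypothetical helper name, not Mesa code. */
static void swap_one_frame(struct wl_display *display,
                           struct wl_surface *surface,
                           struct wl_buffer *buffer,
                           int32_t width, int32_t height)
{
    wl_surface_attach(surface, buffer, 0, 0);
    wl_surface_damage(surface, 0, 0, width, height);
    wl_surface_commit(surface);

    /* The sync request makes the compositor flush its event queue,
     * including any release events queued by the commit above.  We do
     * not need the done event itself, so the callback proxy can be
     * destroyed right away without attaching a listener. */
    struct wl_callback *cb = wl_display_sync(display);
    wl_callback_destroy(cb);
}
```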
In order for this to work, we need to make the assumption that the backend
holds at most N references to the buffer.  I will discuss the nature of
this number N later.  For the DRM backend with one output, N = 2.  Built
into this assumption is that if the backend already has N references it
will not take another one without releasing one of the ones it already has.

Seeing that this always works comes down to observing that at any given
time we either have enough free buffers or the sync event will flush out at
least one release event.  Most of the time attach/commit will result in a
release on the previously attached buffer.  If it does, the release event
will get queued and the sync event will flush the queue.  The two cases in
which it does not release the previously attached buffer directly are if
the buffer has already been released in compositor_accumulate_damage or if
it has been referenced by the backend.  In the first case the release
happened before attach/commit, so it will be received before the sync
event.  In the second case, the backend has either referenced less than N
buffers and the client has extra free buffers or the backend had to release
a buffer before it could reference a new one.  This release must have been
queued before the attach/commit so it will get flushed by the sync.

Given that we know the number N associated with the backend, a client can
continuously render with as few as N+2 buffers (or N+3 for subsurfaces).
It needs N for the backend, one to be currently attached and one to
render.  If it is part of a subsurface that is in "synced" mode, there may
be an additional reference tied up in the commit cache.  It's worth noting
that using N+2 buffers does require a full-blown roundtrip after
attach/commit.  However, given that you are probably doing SwapInterval(0)
in this case and running the GPU hard, that shouldn't be a bottleneck.
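To sanity-check the arithmetic, here is a small self-contained model of the
worst case (my own sketch, not Weston code): at every commit the backend
grabs a reference to the previously attached buffer, and once it holds N
references it queues a release before taking a new one; the sync after each
commit then flushes the queue back to the client.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N 2            /* max refs the backend holds (DRM, one output) */
#define MAX_BUFS 8

/* Returns true if a client with nbufs buffers can render `frames` frames
 * without ever blocking on a buffer release. */
static bool client_survives(int nbufs, int frames)
{
    int free_bufs[MAX_BUFS], nfree = 0;
    int refs[N], nrefs = 0;
    int queue[MAX_BUFS], nqueued = 0;
    int attached = -1;

    for (int i = 0; i < nbufs; i++)
        free_bufs[nfree++] = i;

    for (int f = 0; f < frames; f++) {
        if (nfree == 0)
            return false;             /* starved waiting for a release */
        int buf = free_bufs[--nfree]; /* render into a free buffer */

        /* attach/damage/commit: worst case, the backend takes a
         * reference to the previously attached buffer.  If it already
         * holds N refs, it must queue a release first. */
        if (attached >= 0) {
            if (nrefs == N) {
                queue[nqueued++] = refs[0];
                memmove(refs, refs + 1, --nrefs * sizeof(refs[0]));
            }
            refs[nrefs++] = attached;
        }
        attached = buf;

        /* wl_display.sync: every queued release reaches the client */
        for (int i = 0; i < nqueued; i++)
            free_bufs[nfree++] = queue[i];
        nqueued = 0;
    }
    return true;
}
```

In this model a client with N+2 buffers renders indefinitely, while one
with N+1 starves after a few frames, matching the accounting above.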

Now a note about this number N.  In the DRM backend with a single output, N
= 2.  For a fullscreen surface, one buffer is used for scanout and the
other is tied up in the queued pageflip (which can't be canceled).  The DRM
backend will only take a reference to the currently attached buffer if it
is going to queue it for the next pageflip.  However, this implies that a
pageflip has happened, freeing the previous scanout buffer.  There may be a
subtle issue here if the surface is displayed fullscreen on multiple
outputs.  Mmy knowledge of DRM is a bit lacking here, but it seems like you
could end up with 2 buffers tied up for each output.  Perhaps we should
look into synchronizing pageflips in some way so we use fewer buffers?

--Jason Ekstrand

On Thu, Oct 24, 2013 at 2:34 PM, Jonas Ådahl <jadahl at gmail.com> wrote:

> On Thu, Oct 24, 2013 at 11:26:08AM +0100, Neil Roberts wrote:
> > Hi
> >
> > Thanks for the interesting insights.
> >
> > > It seems to me as if the default should always be to just send the
> > > event.
> >
> > I think I would vote for leaving the default as it is, ie, queuing the
> > release events. It's really quite a corner case that delaying events has
> > any effect on an application because most applications don't need to
> > know about the release events until they are about to draw something.
> > Usually they would only draw something in response to some event such as
> > a frame callback or an input event. In that case the event will have
> > caused the queue to flush so they will certainly be up-to-date about
> > what buffers are available at the point when they start drawing. If we
> > default to not queuing the event then I'd imagine most applications
> > wouldn't realise they should enable it and would miss out on the
> > optimisation.
> >
> > > We can identify buffer release events in weston as coming from one of
> > > three sources:
> > >
> > > 1) wl_surface.commit
> > > 2) surface_flush_damage (the gl renderer releases SHM buffers here)
> > > 3) random releases from the backend/renderer
> > >
> > > Number 2 above happens during the redraw loop so we can just post the
> > > event and won't get a double-wakeup.
> >
> > Yes, I guess even if the compositor posts the event it's not going to
> > actually send it to the client until the compositor goes idle again
> > anyway and at that point it will probably have posted a frame complete
> > callback too so the client would wake up anyway.
> >
> > > Number 3 is something we can't really control; I'd personally lean
> > > towards posting the event here, but it's probably at most one
> > > reference per surface so we can probably get away with queueing.
> > > (Also, if the backend knows when it would release in the render cycle,
> > > it may be able to optimize this better than some compositor-general
> > > solution.) For these two, we can add an argument to
> > > weston_buffer_reference to set the release event type.
> >
> > I think case number 3 is the main problem. It's useful for most
> > fullscreen apps to have the event queued because most of them will be
> > throttled to the frame callback and don't need the release events
> > immediately. However this is also the use case most likely to want
> > eglSwapInterval(0) which would want them immediately so really for this
> > situation it is an application choice whether they should be queued or
> > not.
> >
> > > Number 1 above is the source of the vast majority of our release
> > > events. [...] The good news is that we can, from a client perspective,
> > > deal with this one easily. The solution is to send a wl_display.sync
> > > request immediately after the commit.
> >
> > Yes, I think it makes sense to always sync the rendering to at least a
> > wl_display.sync call and the Mesa patch I sent does already do this. You
> > are right that in practice this effectively solves the problem for most
> > use cases. So really the only case where this matters is when the
> > compositor is directly scanning out from the client's buffer. But on the
> > other hand, that is exactly what a fullscreen game is likely to be doing
> > and that is the most likely candidate for doing eglSwapInterval(0).
> >
> > > In any case, dummy sync and frame requests (you may need both) will
> > > allow you to achieve eglSwapInterval(0) without server-side support.
> >
> > I'm not sure I follow you here. The release event may be queued at any
> > point after the frame complete is sent. In that case sending a sync
> > event to flush the queue is only going to help if Mesa sends it
> > repeatedly, but that amounts to busy-waiting which would be terrible.
> >
> > I still feel like the new request is the right way to go. The difficulty
> > with interface versioning feels like a separate wider problem that we
> > should think about. The crux of the problem is that Mesa probably
> > shouldn't be using proxy objects that are created outside of Mesa
> > because in that case it doesn't have control over what interface version
> > or event queue it is using. Working around the need for the new request
> > would just side-step the issue but it doesn't seem unlikely that Mesa
> > would want to use further new surface interfaces requests later too and
> > the same problem would come back.
> >
> > Maybe we should have a separate object that you can create per surface
> > in order to do new requests. This could be created by a new global
> > object much like the wl_shell interface. In order to make it usable to
> > both Mesa and the application, we would have to allow creating multiple
> > resources of the new interface for a single surface. I'm not sure what
> > to call it though because it would just end up being something like
> > ‘wl_surface_extra_stuff’. Considering that other objects may end up also
> > needing a similar kind of interface, maybe it would make more sense to
> > rethink it a bit and make the compositor allow multiple resources for an
> > object in general. Then you could have something like
> > wl_compositor.rebind_resource(object, version) which would make a new
> > resource for any object and it could have its own interface version. I
> > am just thinking aloud here though, I haven't really thought that
> > through much.
> >
> > I will take a look at how much hassle it would be to get Weston to allow
> > multiple resources per surface.
> One idea that I haven't thought through that much either is to add some
> way for mesa to get the "actual" version of some object it was provided.
> This could be done by adding a new global with a "get_object_version"
> request that mesa could use to get its own object. It would then need to
> be specified that when a compositor is asked for an object of version N,
> it must be able to handle requests from newer versions than N as well.
> With all this, and assuming the opcodes for newer interfaces never
> conflict with old ones, one could just send requests using opcodes
> provided by the -protocol.h file mesa was built with, and the compositor
> should react accordingly. One could see it as something similar to using
> instanceof on the already provided object and casting it.
> Jonas
> >
> > Regards,
> > - Neil
> > _______________________________________________
> > wayland-devel mailing list
> > wayland-devel at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/wayland-devel