[RFC v2] Wayland presentation extension (video protocol)

Mon Feb 10 08:29:15 PST 2014

On 10/02/2014, Jason Ekstrand wrote :
>
>
>
> On Mon, Feb 10, 2014 at 3:53 AM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
>
>     On Sat, 8 Feb 2014 15:23:29 -0600
>     Jason Ekstrand <jason at jlekstrand.net> wrote:
>
>     > Pekka,
>     > First off, I think you've done a great job over-all.  I think it will both
>     > cover most cases and work well.  I've got a few comments below.
>
>     Thank you for the review. :-)
>     Replies below.
>
>     > On Thu, Jan 30, 2014 at 9:35 AM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
>     >
>     > > Hi,
>     > >
>     > > it's time for a take two on the Wayland presentation extension.
>     > >
>     > >
>     > >                 1. Introduction
>     > >
>     > > The v1 proposal is here:
>     > >
>     > >
>     > > http://lists.freedesktop.org/archives/wayland-devel/2013-October/011496.html
>     > >
>     > > In v2 the basic idea is the same: you can queue frames with a
>     > > target presentation time, and you can get accurate presentation
>     > > feedback. All the details are new, though. The re-design started
>     > > from the wish to handle resizing better, preferably without
>     > > clearing the buffer queue.
>     > >
>     > > All the changed details are probably too much to describe here,
>     > > so it is maybe better to look at this as a new proposal. It
>     > > still does build on Frederic's work, and everyone who commented
>     > > on it. Special thanks to Axel Davy for his counter-proposal and
>     > > fighting with me on IRC. :-)
>     > >
>     > > Some highlights:
>     > >
>     > > - Accurate presentation feedback is possible also without
>     > >   queueing.
>     > >
>     > > - You can queue also EGL-based rendering, and get presentation
>     > >   feedback if you want. Also EGL can do this internally, too, as
>     > >   long as EGL and the app do not try to use queueing at the
>     > >   same time.
>     > >
>     > > - More detailed presentation feedback to better allow predicting
>     > >   future display refreshes.
>     > >
>     > > - If wl_viewport is used, neither video resolution changes nor
>     > >   surface (window) size changes alone require clearing the queue.
>     > >   Video can continue playing even during resizes.
>     ...
>     > >   <interface name="presentation" version="1">
>     > >     <description summary="timed presentation related wl_surface requests">
>     > >       The main features of this interface are accurate presentation
>     > >       timing feedback, and queued wl_surface content updates to ensure
>     > >       smooth video playback while maintaining audio/video
>     > >       synchronization. Some features use the concept of a presentation
>     > >       clock, which is defined in presentation.clock_id event.
>     > >
>     > >       Requests 'feedback' and 'queue' can be regarded as additional
>     > >       wl_surface methods. They are part of the double-buffered
>     > >       surface state update mechanism, where other requests first set
>     > >       up the state and then wl_surface.commit atomically applies the
>     > >       state into use. In other words, wl_surface.commit submits a
>     > >       content update.
>     > >
>     > >       Interface wl_surface has requests to set surface related state
>     > >       and buffer related state, because there is no separate interface
>     > >       for buffer state alone. Queueing requires separating the surface
>     > >       from buffer state, and buffer state can be queued while surface
>     > >       state cannot.
>     > >
>     > >       Buffer state includes the wl_buffer from wl_surface.attach, the
>     > >       state assigned by wl_surface requests frame,
>     > >       set_buffer_transform and set_buffer_scale, and any
>     > >       buffer-related state from extensions, for instance
>     > >       wl_viewport.set_source. This state is inherent to the buffer
>     > >       and the content update, rather than the surface.
>     > >
>     > >       Surface state includes all other state associated with
>     > >       wl_surfaces, like the x,y arguments of wl_surface.attach, input
>     > >       and opaque regions, damage, and extension state like
>     > >       wl_viewport.destination. In general, anything expressed in
>     > >       surface local coordinates is better as surface state.
>     > >
>     > >       The standard way of posting new content to a surface using the
>     > >       wl_surface requests damage, attach, and commit is called
>     > >       immediate content submission. This happens when a
>     > >       presentation.queue request has not been sent since the last
>     > >       wl_surface.commit.
>     > >
>     > >       The new way of posting a content update is a queued content
>     > >       update submission. This happens on a wl_surface.commit when a
>     > >       presentation.queue request has been sent since the last
>     > >       wl_surface.commit.
>     > >
>     > >       Queued content updates do not get applied immediately in the
>     > >       compositor but are pushed to a queue on receiving the
>     > >       wl_surface.commit. The queue is ordered by the submission target
>     > >       timestamp. Each item in the queue contains the wl_buffer, the
>     > >       target timestamp, and all the buffer state as defined above. All
>     > >       the queued state is taken from the pending wl_surface state at
>     > >       the time of the commit, exactly like an immediate commit would
>     > >       have taken it.
>     > >
>     > >       For instance on a queueing commit, the pending buffer is queued
>     > >       and no buffer is pending afterwards. The stored values of the
>     > >       x,y parameters of wl_surface.attach are reset to zero, but they
>     > >       also are not queued; queued content updates do not carry the
>     > >       attach offsets. All other surface state (that is not queued),
>     > >       e.g. damage, is not applied nor reset.
>     > >
>     > >       Issuing a queueing commit without a wl_surface.attach is
>     > >       undefined. However, queueing a commit with explicitly attached
>     > >       NULL wl_buffer works; when and if the content update is
>     > >       executed, the surface content is removed as defined for
>     > >       wl_surface.attach.
>     > >
>     > >       If a queued content update has been submitted, and the wl_buffer
>     > >       used in the update is destroyed before the wl_buffer.release
>     > >       event, the results are undefined. The compositor may or may not
>     > >       have executed the update, therefore the surface contents become
>     > >       undefined as explained in wl_surface.attach. Whether any
>     > >       presentation feedback or frame callbacks occur is undefined.
>     > >
>     > >       For each surface, the compositor maintains an association to a
>     > >       single output that is considered as the main output for the
>     > >       surface. Queued content updates are synchronized to the
>     > >       surface's main output, to provide a consistent and meaningful
>     > >       definition of the moment the update is displayed to the user.
>     > >       When a compositor updates an output, it processes only the
>     > >       queues of the surfaces whose main output is the one being
>     > >       updated. The queues of other surfaces, even if they are part of
>     > >       the redrawing, are not processed at that time.
>     > >
>     > >       When a compositor chooses to update an output, it must predict
>     > >       the presentation clock value when the display update will occur.
>     > >       For the definition of the moment of display update, see
>     > >       presentation_feedback.presented. Therefore if the prediction is
>     > >       absolutely perfect, presentation_feedback.presented will carry
>     > >       the same clock value.
>     > >
>     > >       For each surface with queued content updates and matching main
>     > >       output, the compositor picks the update with the highest
>     > >       timestamp no later than a half frame period after the predicted
>     > >       presentation time. The intent is to pick the content update
>     > >       whose target timestamp as rounded to the output refresh period
>     > >       granularity matches the same display update as the compositor is
>     > >       targeting, while not displaying any content update more than a
>     > >
>     >
>     > I'm not really following 100% here. It's not your fault, this is just a
>     > terribly awkward sort of thing to try and put into English.  It sounds to
>     > me like the following: If P0 is the time of the next present and P1 is the
>     > time of the one after that, you look for the largest thing less than the
>     > average of P0 and P1.  Is this correct?  Why go for the average?  The
>     > client is going to have to adjust anyway.
>     >
>     >
>     > >       half frame period too early. If all the updates in the queue are
>     > >       already late, the highest timestamp update is taken regardless
>     > >       of how late it is. Once an update in a queue has been chosen,
>     > >       all remaining updates with an earlier timestamp in the queue are
>     > >       discarded.
>     > >
>     >
>     > Ok, I think what you are saying works.  Again, it's difficult to parse but
>     > these things always are.
>     >
>
>     Yes, it is hard to write a generic algorithm in English. Axel did a
>     nice job clarifying it. I hope I can improve on the language after I
>     have actually implemented this and any possible changes we need to
>     this.
>
>     Also, the inline documentation in the XML file is getting a bit out of
>     hand, lacking in expressional power. I would have liked to use
>     sub-headings, the algorithm could use pseudo-code, etc, but they just
>     don't really exist here. Yet, I want these things to be part of the
>     protocol spec, so the semantics of the protocol get properly defined.
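
For what it's worth, the update-selection rule quoted above can be
sketched in a few lines of Python. This is only an illustration of how
the text reads, not compositor code: the names QueuedUpdate and
pick_update are made up for the example, timestamps are plain integers
in nanoseconds, and the handling of two updates with the same target
timestamp is an assumption, since the text does not cover ties.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class QueuedUpdate:
    target_ns: int  # target timestamp on the presentation clock
    buffer: str     # stand-in for the wl_buffer plus its queued buffer state


def pick_update(
    queue: List[QueuedUpdate], predicted_ns: int, refresh_ns: int
) -> Tuple[Optional[QueuedUpdate], List[QueuedUpdate]]:
    """Return (chosen update or None, updates left in the queue)."""
    # An update is eligible if its target is no later than half a frame
    # period after the predicted presentation time.
    deadline = predicted_ns + refresh_ns // 2
    eligible = [u for u in queue if u.target_ns <= deadline]
    if not eligible:
        # Everything is still too far in the future; leave the queue alone.
        return None, list(queue)
    # Highest eligible timestamp wins. If every update is already late,
    # this is still the highest one, however late it is.
    chosen = max(eligible, key=lambda u: u.target_ns)
    # Earlier updates are discarded. Assumption: updates sharing the
    # chosen timestamp are dropped as well.
    remaining = [u for u in queue if u.target_ns > chosen.target_ns]
    return chosen, remaining
```

With a 16 ms refresh period, a predicted presentation time of 32 ms and
queued targets at 0, 16, 33 and 50 ms, the cut-off is 40 ms, so the
33 ms update is chosen, the 0 and 16 ms updates are discarded, and the
50 ms update stays queued.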
>
>     > >         4.5. The frame callback and swap interval
>     > >
>     > > The frame callback needs to be with the buffer state, so it gets
>     > > queued. If a client makes e.g. EGL's commits queued, EGL may
>     > > still rely on frame callbacks for blocking apps properly, and
>     > > that is related to presenting the buffer, not just the very next
>     > > output refresh. EGL may also internally use queueing and
>     > > feedback to implement swap interval > 1.
>     > >
>     >
>     > Doesn't this mean that you need eglSwapInterval(0) if you're queueing?
>     > This is probably the case anyway, but it might be worth noting explicitly.
>     > I think what you're doing with frame callbacks is sane, but I'm not sure.
>
>     Yeah, swapinterval zero is needed indeed. Personally I would be more
>     worried about whether an EGL implementation agrees to allocate new
>     buffers if the app is queueing in advance. I suspect queueing many
>     frames in advance won't work with EGL in practice.
>
>     But you can still queue a frame at a time, that might be enough for
>     e.g. GL-based video players under good conditions. That might not need
>     swapinterval zero, either.
>
>     > My one latent concern is that I still don't think we're entirely handling
>     > the case that QtQuick wants.  What they want is to do their rendering a few
>     > frames in advance in case of CPU/GPU jitter.  Technically, this extension
>     > handles this by the client simply doing a good job of guessing presentation
>     > times on a one-per-frame basis.  However, it doesn't allow for any damage
>     > tracking.  In the case of QtQuick they want a linear queue of buffers where
>     > no buffer ever gets skipped.  In this case, you could do damage tracking by
>     > allowing it to accumulate from one frame to another and you get all of the
>     > damage-tracking advantages that you had before.  I'm not sure how much this
>     > matters, but it might be worth thinking about it.
>
>     Does it really want to display *every* frame regardless of time? Does
>     it not matter that, if a deadline is missed, the animation slows down
>     rather than jumps to keep up with the intended velocity?
>
>
> That is my understanding of how it works now.  I *think* they figure
> the compositor isn't the bottleneck and that it will get its 60 FPS.
> That said, I don't actually work on QtQuick.  I'm just trying to make
> sure they don't get completely left out in the cold.
>
>
>     Axel has a good point: can't this just be done client side, with
>     immediate updates based on frame callbacks?
>
>
> Probably not.  They're using GLES and EGL, so they can't draw early
> and just stash the buffer.
That's not a problem.
They can render to an FBO linked to an EGLImage, and we can get a
wl_buffer from an EGLImage.

Axel Davy