<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Feb 10, 2014 at 3:53 AM, Pekka Paalanen <span dir="ltr"><<a href="mailto:ppaalanen@gmail.com" target="_blank">ppaalanen@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On Sat, 8 Feb 2014 15:23:29 -0600<br>
Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
<br>
> Pekka,<br>
> First off, I think you've done a great job over-all. I think it will both<br>
> cover most cases and work well I've got a few comments below.<br>
<br>
</div>Thank you for the review. :-)<br>
Replies below.<br>
<div><div class="h5"><br>
> On Thu, Jan 30, 2014 at 9:35 AM, Pekka Paalanen <<a href="mailto:ppaalanen@gmail.com">ppaalanen@gmail.com</a>> wrote:<br>
><br>
> > Hi,<br>
> ><br>
> > it's time for a take two on the Wayland presentation extension.<br>
> ><br>
> ><br>
> > 1. Introduction<br>
> ><br>
> > The v1 proposal is here:<br>
> ><br>
> > <a href="http://lists.freedesktop.org/archives/wayland-devel/2013-October/011496.html" target="_blank">http://lists.freedesktop.org/archives/wayland-devel/2013-October/011496.html</a><br>
> ><br>
> > In v2 the basic idea is the same: you can queue frames with a<br>
> > target presentation time, and you can get accurate presentation<br>
> > feedback. All the details are new, though. The re-design started<br>
> > from the wish to handle resizing better, preferably without<br>
> > clearing the buffer queue.<br>
> ><br>
> > All the changed details are probably too much to describe here,<br>
> > so it is maybe better to look at this as a new proposal. It<br>
> > still does build on Frederic's work, and everyone who commented<br>
> > on it. Special thanks to Axel Davy for his counter-proposal and<br>
> > fighting with me on IRC. :-)<br>
> ><br>
> > Some highlights:<br>
> ><br>
> > - Accurate presentation feedback is possible also without<br>
> > queueing.<br>
> ><br>
> > - You can queue also EGL-based rendering, and get presentation<br>
> > feedback if you want. Also EGL can do this internally, too, as<br>
> > long as EGL and the app do not try to use queueing at the same time.<br>
> ><br>
> > - More detailed presentation feedback to better allow predicting<br>
> > future display refreshes.<br>
> ><br>
> > - If wl_viewport is used, neither video resolution changes nor<br>
> > surface (window) size changes alone require clearing the queue.<br>
> > Video can continue playing even during resizes.<br>
</div></div>...<br>
<div><div class="h5">> > <interface name="presentation" version="1"><br>
> > <description summary="timed presentation related wl_surface requests"><br>
> > The main features of this interface are accurate presentation<br>
> > timing feedback, and queued wl_surface content updates to ensure<br>
> > smooth video playback while maintaining audio/video<br>
> > synchronization. Some features use the concept of a presentation<br>
> > clock, which is defined in presentation.clock_id event.<br>
> ><br>
> > Requests 'feedback' and 'queue' can be regarded as additional<br>
> > wl_surface methods. They are part of the double-buffered<br>
> > surface state update mechanism, where other requests first set<br>
> > up the state and then wl_surface.commit atomically applies the<br>
> > state into use. In other words, wl_surface.commit submits a<br>
> > content update.<br>
> ><br>
> > Interface wl_surface has requests to set surface related state<br>
> > and buffer related state, because there is no separate interface<br>
> > for buffer state alone. Queueing requires separating the surface<br>
> > from buffer state, and buffer state can be queued while surface<br>
> > state cannot.<br>
> ><br>
> > Buffer state includes the wl_buffer from wl_surface.attach, the<br>
> > state assigned by wl_surface requests frame,<br>
> > set_buffer_transform and set_buffer_scale, and any<br>
> > buffer-related state from extensions, for instance<br>
> > wl_viewport.set_source. This state is inherent to the buffer<br>
> > and the content update, rather than the surface.<br>
> ><br>
> > Surface state includes all other state associated with<br>
> > wl_surfaces, like the x,y arguments of wl_surface.attach, input<br>
> > and opaque regions, damage, and extension state like<br>
> > wl_viewport.destination. In general, anything expressed in<br>
> > surface local coordinates is better as surface state.<br>
> ><br>
> > The standard way of posting new content to a surface using the<br>
> > wl_surface requests damage, attach, and commit is called<br>
> > immediate content submission. This happens when a<br>
> > presentation.queue request has not been sent since the last<br>
> > wl_surface.commit.<br>
> ><br>
> > The new way of posting a content update is a queued content<br>
> > update submission. This happens on a wl_surface.commit when a<br>
> > presentation.queue request has been sent since the last<br>
> > wl_surface.commit.<br>
> ><br>
> > Queued content updates do not get applied immediately in the<br>
> > compositor but are pushed to a queue on receiving the<br>
> > wl_surface.commit. The queue is ordered by the submission target<br>
> > timestamp. Each item in the queue contains the wl_buffer, the<br>
> > target timestamp, and all the buffer state as defined above. All<br>
> > the queued state is taken from the pending wl_surface state at<br>
> > the time of the commit, exactly like an immediate commit would<br>
> > have taken it.<br>
> ><br>
> > For instance on a queueing commit, the pending buffer is queued<br>
> > and no buffer is pending afterwards. The stored values of the<br>
> > x,y parameters of wl_surface.attach are reset to zero, but they<br>
> > also are not queued; queued content updates do not carry the<br>
> > attach offsets. All other surface state (that is not queued),<br>
> > e.g. damage, is not applied nor reset.<br>
> ><br>
> > Issuing a queueing commit without a wl_surface.attach is<br>
> > undefined. However, queueing a commit with explicitly attached<br>
> > NULL wl_buffer works; when and if the content update is<br>
> > executed, the surface content is removed as defined for<br>
> > wl_surface.attach.<br>
> ><br>
> > If a queued content update has been submitted, and the wl_buffer<br>
> > used in the update is destroyed before the wl_buffer.release<br>
> > event, the results are undefined. The compositor may or may not<br>
> > have executed the update, therefore the surface contents become<br>
> > undefined as explained in wl_surface.attach. Whether any<br>
> > presentation feedback or frame callbacks occur is undefined.<br>
> ><br>
> > For each surface, the compositor maintains an association to a<br>
> > single output that is considered as the main output for the<br>
> > surface. Queued content updates are synchronized to the<br>
> > surface's main output, to provide a consistent and meaningful<br>
> > definition of the moment the update is displayed to the user.<br>
> > When a compositor updates an output, it processes only the<br>
> > queues of the surfaces whose main output is the one being<br>
> > updated. The queues of other surfaces, even if they are part of<br>
> > the redrawing, are not processed at that time.<br>
> ><br>
> > When a compositor chooses to update an output, it must predict<br>
> > the presentation clock value when the display update will occur.<br>
> > For the definition of the moment of display update, see<br>
> > presentation_feedback.presented. Therefore if the prediction is<br>
> > absolutely perfect, presentation_feedback.presented will carry<br>
> > the same clock value.<br>
> ><br>
> > For each surface with queued content updates and matching main<br>
> > output, the compositor picks the update with the highest<br>
> > timestamp no later than a half frame period after the predicted<br>
> > presentation time. The intent is to pick the content update<br>
> > whose target timestamp as rounded to the output refresh period<br>
> > granularity matches the same display update as the compositor is<br>
> > targeting, while not displaying any content update more than a<br>
> ><br>
><br>
> I'm not really following 100% here. It's not your fault, this is just a<br>
> terribly awkward sort of thing to try and put into English. It sounds to<br>
> me like the following: If P0 is the time of the next present and P1 is the<br>
> time of the one after that, you look for the largest thing less than the<br>
> average of P1 and P2. Is this correct? Why go for the average? The<br>
> client is going to have to adjust anyway.<br>
><br>
><br>
> > half frame period too early. If all the updates in the queue are<br>
> > already late, the highest timestamp update is taken regardless<br>
> > of how late it is. Once an update in a queue has been chosen,<br>
> > all remaining updates with an earlier timestamp in the queue are<br>
> > discarded.<br>
> ><br>
><br>
> Ok, I think what you are saying works. Again, it's difficult to parse but<br>
> these things always are.<br>
><br>
<br>
</div></div>Yes, it is hard to write a generic algorithm in English. Axel did a<br>
nice job clarifying it. I hope I can improve on the language after I<br>
have actually implemented this and any possible changes we need to this.<br>
<br>
Also, the inline documentation in the XML file is getting a bit out of<br>
hand, lacking in expressional power. I would have liked to use<br>
sub-headings, the algorithm could use pseudo-code, etc, but they just<br>
don't really exist here. Yet, I want these things to be part of the<br>
protocol spec, so the semantics of the protocol get properly defined.<br>
<div class=""><br>
> > 4.5. The frame callback and swap interval<br>
> ><br>
> > The frame callback needs to be with the buffer state, so it gets<br>
> > queued. If a client makes e.g. EGL's commits queued, EGL may<br>
> > still rely on frame callbacks for blocking apps properly, and<br>
> > that is related to presenting the buffer, not just the very next<br>
> > output refresh. EGL may also internally use queueing and<br>
> > feedback to implement swap interval > 1.<br>
> ><br>
><br>
> Doesn't this mean that you need eglSwapInterval(0) if you're queueing?<br>
> This is probably the case anyway, but it might be worth noting explicitly.<br>
> I think what you're doing with frame callbacks is sane, but I'm not sure.<br>
<br>
</div>Yeah, swapinterval zero is needed indeed. Personally I would be more<br>
worried about whether an EGL implementation agrees to allocate new<br>
buffers if the app is queueing in advance. I suspect queueing many<br>
frames in advance won't work with EGL in practice.<br>
<br>
But you can still queue a frame at a time, that might be enough for<br>
e.g. GL-based video players under good conditions. That might not need<br>
swapinterval zero, either.<br>
<div class=""><br>
> My one latent concern is that I still don't think we're entirely handling<br>
> the case that QtQuick wants. What they want is to do their rendering a few<br>
> frames in advance in case of CPU/GPU jitter. Technically, this extension<br>
> handles this by the client simply doing a good job of guessing presentation<br>
> times on a one-per-frame baseis. However, it doesn't allow for any damage<br>
> tracking. In the case of QtQuick they want a linear queue of buffers where<br>
> no buffer ever gets skipped. In this case, you could do damage tracking by<br>
> allowing it to accumulate from one frame to another and you get all of the<br>
> damage-tracking advantages that you had before. I'm not sure how much this<br>
> matters, but it might be worth thinking about it.<br>
<br>
</div>Does it really want to display *every* frame regardless of time? It<br>
doesn't matter that if a deadline is missed, the animation slows down<br>
rather than jumps to keep up with intended velocity?<br></blockquote><div><br></div><div>That is my understanding of how it works now. I *think* they figure the compositor isn't the bottle-kneck and that it will git its 60 FPS. That said, I don't actually work on QtQuick. I'm just trying to make sure they don't get completely left out in the cold.<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Axel has a good point, cannot this be just done client side and<br>
immediate updates based on frame callbacks?<br></blockquote><div><br></div><div>Probably not. They're using GLES and EGL so they can't draw early and just stash the buffer.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
If there is a problem in using frame callbacks for that, that is more<br>
likely a problem in the compositor's frame scheduling than the protocol.<br>
<br>
The problem with damage tracking, why I did not take damage as queued<br>
state, is that it is given in surface coordinates. This will become a<br>
problem during resizes, where the surface size changes, and wl_viewport<br>
is used to decouple the content from the surface space.<br></blockquote><div><br></div><div>The separation makes sense.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
If we queue damage, we basically need to queue also surface resizes.<br>
Without wl_viewport this is what happens automatically, as surface size<br>
is taken from the buffer size.<br>
<br>
However, in the proposed design, the purpose of wl_viewport is to<br>
decouple the surface size from buffer size, so that they can change<br>
independently. The use case is live video: if you resize the window,<br>
you don't want to redo the video frames, because that would likely<br>
cause a glitch. Also if the video resolution changes on the fly, by e.g.<br>
stream quality control, you don't need to do anything extra to keep the<br>
window at the old size. Damage is a property of the content update,<br>
yes, but we have it defined in surface coordinates, so when surface and<br>
buffer sizes change asynchronously, the damage region would be<br>
incorrect.<br>
<br>
The downside is indeed that we lose damage information for queued<br>
buffers. This is a deliberate design choice, since the extension was<br>
designed primarily for video where usually the whole surface gets<br>
damaged.<br></blockquote><div><br>Yeah, I think you made the right call on this one. Queueing buffers in a completely serial fassion really does seem to be a special case. Trying to do damage tracking for an arbitrary queue would very quickly get insane. Plus all the other problems you mentioned.<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
But, I guess we could add another request, presentation.damage, to give<br>
the damage region in buffer coordinates. Would it be worth it?<br>
<br>
Other solutions do not come into my mind at the moment.<br>
<br>
<br>
Thanks,<br>
pq<br>
</blockquote></div><br></div><div class="gmail_extra">It's looking good to me.<br>--Jason Ekstrand<br></div></div>