Time in wayland presentation extension
ppaalanen at gmail.com
Tue Dec 9 07:32:52 PST 2014
On Tue, 09 Dec 2014 14:49:01 +0100
Dan Oscarsson <Dan.Oscarsson at tieto.com> wrote:
> tis 2014-12-09 klockan 14:10 +0200 skrev Pekka Paalanen:
> > On Tue, 09 Dec 2014 11:21:31 +0100
> > Dan Oscarsson <Dan.Oscarsson at tieto.com> wrote:
> > > Looking at the wayland presentation extension I see that you send time
> > > in the format: seconds plus nanoseconds.
> > > This is bad, as you constantly have to convert between a single
> > > time value and the split into seconds+nanoseconds.
> > > Why not just send nanoseconds?
> > > 64-bits of nanoseconds is long enough for many years in the future.
> > Y2K ;-)
> Yes, but you will get that with two 32-bit values too.
No, seconds is 64-bit in the protocol currently.
> > Seriously though, why is it bad to use timespec format in the protocol?
> > clock_gettime uses struct timespec, and if you use anything less, you
> > risk losing something. It's hard to imagine the burden of converting or
> > even computing with it to be measurable.
> 64-bits in nanoseconds can contain longer time than 32-bit for seconds
> and 32-bits for nanoseconds. So you do not lose anything by specifying
> time as a 64-bit nanoseconds value.
> > But if you really have a benchmark to show that the difference is
> > measurable in, say, a video player, I would be very interested.
> > Anyway, what goes on the wire is a different thing than what is stored
> > inside your application. You are free to use any format you want. The
> > conversions will happen only at the protocol interface, which is in the
> > order of twice per frame in a video player.
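To illustrate, the conversion at the protocol boundary is a couple of
lines of C (a sketch; the helper names are made up):

```c
#include <stdint.h>
#include <time.h>

/* Convert a struct timespec (e.g. from clock_gettime()) to a single
 * 64-bit nanosecond value for application-internal bookkeeping. */
static uint64_t timespec_to_nsec(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* And back again, e.g. when filling in protocol arguments. */
static struct timespec nsec_to_timespec(uint64_t ns)
{
	struct timespec ts;

	ts.tv_sec = (time_t)(ns / 1000000000ULL);
	ts.tv_nsec = (long)(ns % 1000000000ULL);
	return ts;
}
```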
> Yes, it is not that bad. But looking at the code in mplayer/mpv there
> are many places with time conversions: struct timespec -> float ->
> double -> 32-bit -> 64-bit. A large mess, many places where a mistake
> can be made. It would be much cleaner with a single simple representation of
> time that could be used anywhere. The simplest is one integer value or
> one double value (though double might lose some nanoseconds in some
> calculations I suspect).
Unfortunately that is not a problem that Wayland could solve.
Also, timestamps vs. time intervals often deserve different types. For
instance, float is not appropriate for a timestamp at all, while double
might be. And yes, the computations you want to do also affect what
type you want to choose.
I do not think there is one type that can fit all.
Luckily, the type used in the protocol only needs to represent a value,
not be convenient to compute with.
> > We don't even have (and cannot add) a u64 wire type, so each 64-bit
> > quantity is split into two u32 anyway.
> When all modern hardware supports 64 bits well.
Yeah, it was an oversight that got written into stone when the
libwayland ABI was stabilized. We realised it only long after the fact.
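For what it's worth, the split is mechanical. A sketch of packing a
64-bit seconds value into two u32 wire arguments (the helper names here
are made up):

```c
#include <stdint.h>

/* Pack a 64-bit seconds value into the two u32 wire arguments
 * (the Presentation protocol calls these tv_sec_hi and tv_sec_lo). */
static void sec_to_wire(uint64_t sec, uint32_t *hi, uint32_t *lo)
{
	*hi = (uint32_t)(sec >> 32);
	*lo = (uint32_t)(sec & 0xffffffffu);
}

/* Reassemble on the receiving side. */
static uint64_t sec_from_wire(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | (uint64_t)lo;
}
```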
> Though it is very easy to send 64 bits as two 32-bit values. It is much
> worse to convert an integer into seconds and nanoseconds - especially if
> you want to do it on old 32-bit hardware.
> Anyway, I mostly wanted a simple value in the protocol. Just like I
> would have preferred clock_gettime to return a 64-bit value instead.
Sorry, I'm quite hesitant to go back there.
> > The above semantics are intended to give zero latency on average if the
> > target times are uniformly distributed compared to vsync. This is good
> > for programs that do not lock on to the vsync cycle and maintain the
> > vsync<->clock association. It is also good for display systems that do
> > not have a stable vsync frequency (FreeSync, Adaptive-Sync, Nvidia
> > G-sync). The compositor presumably knows the instantaneous frame cycle
> > length, but clients are always behind in that knowledge.
> I have expected FreeSync, Adaptive-Sync, Nvidia G-sync to actually allow
> the video player (not the compositor) to play video and get each frame
> to be displayed at exactly the correct time for all movie speeds. It
> would also handle videos with variable frame rate to be displayed correctly.
Yes, but always through the compositor. It can bypass compositing, but
not the compositor.
X11 has special "let's use a hardware overlay" protocol extension, but
we deliberately do not want such thing in Wayland. The compositor is
always in full control of all presentation. Using a hardware overlay is
an internal decision that the compositor does every output frame. This
way the compositor will composite when it has to, and use hardware
resources when it can. This leads to optimal usage of display hardware
resources without complicated negotiations with clients.
With dynamic refresh, the compositor could adapt to the requirements of
all the running clients. Most clients just present frames "ASAP", but
if a video player uses queueing, it obviously cares about timing, and
the compositor can prioritize it. The point is, that the compositor
knows everything that happens on the display, so it can make optimal
decisions. Clients cannot hurt other clients by hijacking hardware
resources. Clients also cannot do optimal decisions about hardware
usage, because one client does not know what other clients do.
When the compositor manages the hardware, we also trivially avoid cases like:
- player 1 grabs the hw overlay
- player 1 plays a bit
- player 2 starts, the overlay is already taken, have to fall back
- player 1 finishes and releases the overlay
- player 2 continues while hw overlay remains unused
That is just the tip of the iceberg of problems related to an explicit
"give me a hw overlay" protocol.
> > OTOH, if the vsync frequency is dynamic, you can't really lock your
> > frame queue to it. Instead, you need to rely on the compositor to lock
> > the vsync frequency to your frame queue. This is why I think the
> > "stupid program" case is more universal, even if it sacrifices accuracy
> > on contemporary displays by a half frame period.
> Before I got a modern LCD tv which included frame interpolation the
> world was simpler for display of video. But now you have to display
> frames at correct vsync or the video will jump which is bad for viewing.
I fully intend for you to be able to target the correct vsync when you
so wish, provided the vsync is also predictable by the client.
That's a good point about TVs.
Does frame interpolation have any overlap with dynamic vsync? I mean,
can they exist at the same time in the same display device? Would that
ever make sense?
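As a sketch of what "targeting the correct vsync" could look like on
the client side, assuming the client keeps the last presented timestamp
and the refresh duration from presentation feedback (the names here are
made up; a refresh of 0 stands for unpredictable/dynamic vsync):

```c
#include <stdint.h>

/* Round a desired presentation time to the predicted vsync nearest to
 * it, given the last presented timestamp and the refresh duration from
 * presentation feedback. A refresh of 0 means the compositor cannot
 * predict vsyncs (e.g. dynamic refresh), so the target is passed
 * through untouched and the compositor decides. */
static uint64_t predict_vsync(uint64_t last_presented_ns,
			      uint64_t refresh_ns, uint64_t desired_ns)
{
	uint64_t cycles;

	if (refresh_ns == 0 || desired_ns <= last_presented_ns)
		return desired_ns;

	/* Number of whole refresh cycles to the vsync nearest the
	 * desired time, rounding to nearest. */
	cycles = (desired_ns - last_presented_ns + refresh_ns / 2) /
		 refresh_ns;
	return last_presented_ns + cycles * refresh_ns;
}
```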
> Having a compositor complicates things.
> When running in full screen mode, the video player should be totally in
> control of refresh rate and when to show frames. This works fine for
> both fixed and dynamic refresh rate.
No, that's not how Wayland is designed. We cannot give direct hardware
access to clients. The purpose of a window system is to make apps
cooperate. No client can overrule the compositor decisions. It protects
users from malfunctioning apps, and it also improves security.
But, e.g. for fullscreen windows, we have had the idea for a client to
say that they would prefer a certain video mode and refresh rate. It is
still up to the compositor to decide when it honours that, if at all.
This would be part of the public shell protocol.
Even without that, if the video player is fullscreen and using
queueing, and there is nothing else to show on the display, the video
player would effectively be in control of the dynamic refresh through
the frame target timestamps.
If that doesn't work for you, you might want to run the video player
without a window system, straight on DRM. Then it is in total control.
A lot of the infrastructure that makes Wayland compositors possible
also makes it more feasible to implement this in a video player.
> But when video is not in full screen mode, it can work fine with fixed
> refresh rate (if compositor displays video at requested vsync) but when
> using dynamic refresh rate - who should decide when vsyncs should
> occur? Video player or compositor? And you could actually have more than
> one video player at the same time. It could probably be handled by
> compositor setting vsyncs at all video players requested vsyncs and
> those needed for other windows.
The compositor always decides.
I touched on this topic a little above. I would imagine that if there is
clearly one timing critical source (a single video window using
queueing), the compositor can choose to vsync to it. If there are more
video windows with conflicting timings, the compositor can attempt to
find a close enough common multiple of the framerates, or just fall
back to a default fixed rate.
One could also avoid dropping other apps down to the video framerate by
doing additional vsyncs between the vsyncs required by the video, if
that is possible.
All this is just hypothetical, since I don't think anyone here has got
to play with dynamic vsync displays.
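As a toy sketch of the "common multiple" idea, with rates expressed in
millihertz to avoid fractions (the helper names are made up):

```c
#include <stdint.h>

static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b) {
		uint64_t t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/* Least common multiple of two content rates, i.e. the slowest single
 * refresh rate that can show both without judder. With rates in mHz:
 * 24000 (24 Hz) and 30000 (30 Hz) combine to 120000 (120 Hz). */
static uint64_t common_rate_mhz(uint64_t a, uint64_t b)
{
	return a / gcd_u64(a, b) * b;
}
```

Of course, the result can easily exceed what the hardware can refresh
at, which is when the fallback to a default fixed rate kicks in.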
> As a side note - do not know if it is fixed in Wayland - when I am using
> Gnome 3.10 with several screens their compositor syncs updating to the
> screen with lowest refresh rate. I hope this is not so with Wayland. I
> have a LCD tv running at 24 Hz, and a normal monitor running at 60 Hz.
> It gets really bad to have the compositor just updating at 24 Hz on the
> 60 Hz screen. The correct solution is to update each screen with the
> refresh rate of the screen - more complex but much nicer to look at.
That is mostly a compositor issue. Wayland is largely irrelevant there.
(However, for the problem under X, X.org is relevant, and X11 maybe.
Maybe you wouldn't have that problem if the monitors were separate
SCREENs, i.e. zaphod mode, but with the other caveats it brings.)
Compositors are perfectly able to refresh each display individually,
and that is what Weston does. There is a repaint loop per output.
The only implication from Wayland is that a surface (window) is always
synchronized to one output as far as the client is concerned. So when
a client asks "tell me when this frame is presented", you get back only
one timestamp or callback, not one for each output the surface is on.
It doesn't mean that a window that is half on output A and half on
output B would tear on either, no. It only means that the feedback to
clients about presentation assumes there is one output that this
surface is synchronized to. The same applies to processing the
presentation queue in the compositor in the proposal, too.
The output where a surface is synchronized is dynamic. Weston chooses
the output based on largest intersection area at all times. This is why
Presentation feedback has the sync_output event to say which one it was.
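Not Weston's actual code, but the intersection-area choice could be
sketched like this:

```c
#include <stdint.h>

struct rect {
	int32_t x, y, w, h;
};

/* Area of the intersection of two rectangles, 0 if they are disjoint.
 * Evaluating this for the surface against each output and picking the
 * output with the largest area gives the sync_output. */
static int64_t intersect_area(struct rect a, struct rect b)
{
	int32_t x0 = a.x > b.x ? a.x : b.x;
	int32_t y0 = a.y > b.y ? a.y : b.y;
	int32_t x1 = a.x + a.w < b.x + b.w ? a.x + a.w : b.x + b.w;
	int32_t y1 = a.y + a.h < b.y + b.h ? a.y + a.h : b.y + b.h;

	if (x1 <= x0 || y1 <= y0)
		return 0;
	return (int64_t)(x1 - x0) * (int64_t)(y1 - y0);
}
```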
> > > Maybe you should take a look at vdpau and its presentation queue, if you
> > > have not done that already. It uses 64-bit timestamps and schedules
> > > frames in a way that is easy to use to get frames displayed when you
> > > want.
> > I think I did. I tried to find all video timing related APIs on Linux
> > when I researched for this extension. I also took into account that you
> > need to be able to implement GLX_OML_sync_control on top of this. These
> > are not easy to do, but it's the best I could do so far. I consulted
> > some Gstreamer experts, too.
> > I probably can't look into the Presentation queueing this year anymore,
> > but I intend to finalize Presentation feedback in two weeks. It's only
> > missing the feedback flags.
> > Comments welcome. :-)
> Unfortunately I think I will not have time for a deeper study at the
> moment. The most important things I need back is: when a frame was
> presented (so I can see if it was presented at correct vsync or not) or
> if it was dropped.
Yes, that is included.
That is the current upstream version of Presentation, which is lacking
queueing. Queueing is still lots of work.
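For example, a client could classify the feedback with something like
this (a made-up helper; presented and target times as 64-bit
nanoseconds, half a refresh duration as the tolerance):

```c
#include <stdint.h>

/* Did a frame hit the vsync it was aimed at? Compare the presented
 * timestamp against the target, allowing half a refresh duration of
 * slack; a refresh of 0 (unpredictable vsync) never matches. */
static int hit_target_vsync(uint64_t presented_ns, uint64_t target_ns,
			    uint64_t refresh_ns)
{
	uint64_t d = presented_ns > target_ns ?
		     presented_ns - target_ns : target_ns - presented_ns;

	return refresh_ns != 0 && d <= refresh_ns / 2;
}
```

A dropped frame is reported separately, via the discarded event, so it
would not reach this check at all.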
> As we may get dynamic refresh rates, there might be need of a feedback
> saying if screen refresh rate is fixed or dynamic.
Yeah, might need that. Currently in Presentation feedback, the
compositor can set the 'refresh' argument to zero nanoseconds when it
cannot meaningfully predict the next vsyncs. It's pretty vague still.
> If I have queued a frame for update, can I cancel that if it has not
> been display yet? vdpau is missing that so if you queue several frames
> in advance you cannot stop displaying immediately.
Yes, you can cancel, but in the proposal the only option is to cancel
all queued frames. It would be a lot more complicated to allow cancelling
individual frames and there hasn't been a use case for it yet.
I think the latest series that also includes queueing is