Time in wayland presentation extension

Pekka Paalanen ppaalanen at gmail.com
Tue Dec 9 04:10:59 PST 2014


On Tue, 09 Dec 2014 11:21:31 +0100
Dan Oscarsson <Dan.Oscarsson at tieto.com> wrote:

> Hi
> 
> While I have been looking at Wayland for a long time I have not yet
> tried it.
> 
> I have for a long time worked with video players and used VDPAU as the
> best way to get video working well.
> I have not yet studied the Wayland protocols very much to see how things
> will work there. Hopefully we can get VDPAU for it too.
> VDPAU has a timing and a presentation queue, and it works very well.
> The basic needs for best video display is if you can:
>   - get info when (the time) vsyncs occur.
>   - get the time between vsyncs.
>   - schedule frame display at a specific vsync and get feedback on
> whether it was displayed correctly.

Hi,

yes, the more eyes on a spec, the merrier. Welcome. :-)

> Looking at the wayland presentation extension I see that you send time
> in the format: seconds plus nanoseconds.
> This is bad as you constantly have to convert between a single time
> value and the split seconds+nanoseconds representation.
> Why not just send nanoseconds?
> 64-bits of nanoseconds is long enough for many years in the future.

Y2K ;-)

Seriously though, why is it bad to use the timespec format in the protocol?
clock_gettime uses struct timespec, and if you use anything less, you
risk losing something. It's hard to imagine the burden of converting, or
even of computing with it, being measurable.

But if you really have a benchmark to show that the difference is
measurable in, say, a video player, I would be very interested.

Anyway, what goes on the wire is a different thing from what is stored
inside your application. You are free to use any format you want. The
conversions happen only at the protocol interface, which is on the
order of twice per frame in a video player.

We don't even have (and cannot add) a u64 wire type, so each 64-bit
quantity is split into two u32 anyway.
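
As a concrete illustration, reassembling the two u32 halves of the
seconds field and the u32 nanoseconds field is only a couple of integer
operations on the receiving side. This is a minimal sketch in C; the
parameter names mirror the split-timestamp arguments carried on the
wire, but the helper itself is made up for this example:

#include <stdint.h>

/* Combine the split wire representation into a single 64-bit
 * nanosecond count for application-internal use. */
static uint64_t
wire_to_nsec(uint32_t tv_sec_hi, uint32_t tv_sec_lo, uint32_t tv_nsec)
{
        uint64_t sec = ((uint64_t)tv_sec_hi << 32) | tv_sec_lo;

        return sec * 1000000000ULL + tv_nsec;
}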

> Using a single value for time is much easier to use. When you schedule
> video frame display you do calculations on time and you do not want to
> keep time in split seconds+nanosecond values.

You can use a single-value format internally; this is only the wire protocol.

However, the protocol deals with absolute timestamps, where the base is
unspecified from the protocol's point of view, so wraparound in a
narrower format could become a problem.

One also has to be able to read the clock that the compositor is using.
On Linux, the protocol defines that the clock can be read with
clock_gettime, which returns a struct timespec. If the protocol used
something else as the time format, it would need to define the
conversion. Defining a standard conversion in a future-proof way is
something I would rather avoid, so I just send the complete data
through.

This way you can choose the conversion that suits your data types the
best, be it u32 milliseconds, u64 nanoseconds, { s16 seconds, u16
milliseconds }, or whatever, while still being given the whole original
time data. If you lose anything, it's your own choice.
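
For example, reading the compositor's clock with clock_gettime and
flattening the struct timespec into a single value is straightforward.
A rough sketch, assuming the clock id advertised by the compositor has
already been obtained and is passed in as 'clk'; the helper names are
invented for illustration:

#include <stdint.h>
#include <time.h>

/* Lossless flattening into 64-bit nanoseconds. */
static uint64_t
clock_now_nsec(clockid_t clk)
{
        struct timespec ts;

        clock_gettime(clk, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* Lossy alternative: u32 milliseconds wraps after about 49.7 days and
 * drops sub-millisecond precision - the choice is the application's. */
static uint32_t
clock_now_msec(clockid_t clk)
{
        struct timespec ts;

        clock_gettime(clk, &ts);
        return (uint32_t)(ts.tv_sec * 1000 + ts.tv_nsec / 1000000);
}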

> I am not sure that the way the compositor picks the update to do is
> good. The protocol says "the update with the highest timestamp no later
> than a half frame period after the predicted presentation time". When I
> plan for a video frame to be displayed, I queue it at a suitable time
> before the vsync it should be displayed on. It may not be displayed on
> an earlier vsync. It may be that the plan here is to allow applications
> to queue frames for display without knowing when they will be displayed,
> letting the compositor show them at the best vsync. But for me, who wants
> as good a display as possible while keeping audio in sync, I calculate the
> best vsync and then queue the frame for presentation - the compositor may
> not move the display time backwards - it may move it forward to the next
> vsync if it misses the correct vsync, provided the application is informed
> of that (in good time, so the application can discard the next planned
> frame).

This is very much how Presentation queueing was designed to work.
Foremost it guarantees that frames cannot be displayed out of order,
where the order is dictated by the target timestamps.
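
To make the quoted selection rule concrete, here is a rough sketch of
the compositor-side pick. It is not taken from any real compositor; the
queue structure and names are invented for illustration, and all times
are reduced to 64-bit nanoseconds:

#include <stddef.h>
#include <stdint.h>

struct queued_update {
        uint64_t target_nsec;   /* requested presentation time */
        /* buffer, damage, etc. would live here */
};

/* Return the index of the queued update with the highest target
 * timestamp no later than half a frame period after the predicted
 * presentation time, or -1 if nothing qualifies for this vsync yet.
 * Presenting the chosen update and discarding the ones with earlier
 * targets keeps frames in target-timestamp order. */
static int
pick_update(const struct queued_update *q, size_t n,
            uint64_t predicted_nsec, uint64_t refresh_nsec)
{
        uint64_t deadline = predicted_nsec + refresh_nsec / 2;
        int best = -1;
        size_t i;

        for (i = 0; i < n; i++) {
                if (q[i].target_nsec > deadline)
                        continue;
                if (best < 0 || q[i].target_nsec > q[best].target_nsec)
                        best = (int)i;
        }

        return best;
}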

The above semantics are intended to give zero latency on average when
the target times are uniformly distributed relative to vsync. This is
good for programs that do not lock on to the vsync cycle and do not
maintain the vsync<->clock association. It is also good for display
systems that do not have a stable vsync frequency (FreeSync,
Adaptive-Sync, Nvidia G-sync). The compositor presumably knows the
instantaneous frame cycle length, but clients are always behind in that
knowledge.

Clients that do synchronize to vblanks explicitly and maintain that
association also know how the compositor picks the frame to be
presented. If you want "not before" semantics, you could mangle your
target timestamps (or use flags) to get it, assuming there is a stable
vsync frequency; see the sketch below.
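
For instance, under the "no later than half a frame period after the
predicted presentation time" rule, a frame targeted at time T can be
picked for a vsync presenting as early as T minus half a refresh
period. Shifting the requested target forward by half a period turns
that into strict "not before" behaviour. A sketch, assuming a stable
and known refresh period; the helper name is made up:

#include <stdint.h>

/* Bias a desired presentation time so the half-frame-period pick rule
 * cannot select an earlier vsync than the one at or after 'desired'. */
static uint64_t
not_before_target(uint64_t desired_nsec, uint64_t refresh_nsec)
{
        return desired_nsec + refresh_nsec / 2;
}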

OTOH, if the vsync frequency is dynamic, you can't really lock your
frame queue to it. Instead, you need to rely on the compositor to lock
the vsync frequency to your frame queue. This is why I think the
"stupid program" case is more universal, even if it sacrifices accuracy
on contemporary displays by a half frame period.

We did also discuss queueing flags, where one flag would modify the
meaning of the given target timestamp to be strictly "not before". It
is easy to implement in a compositor, and IIRC I already have it in
some branch.

> It may be that you have two cases here:
>   1) the stupid program that just queues video frames without knowing when
> they will be displayed.
>   2) the advanced programs that schedule frames for display on the best
> vsyncs.
> And the presentation extension is mostly thought for 1).

Yes.

> Maybe you should take a look at vdpau and its presentation queue, if you
> have not done that already. It uses 64-bit timestamps and schedules
> frames in a way that is easy to use to get frames displayed when you
> want.

I think I did. I tried to find all the video-timing-related APIs on
Linux while researching this extension. I also took into account that
one needs to be able to implement GLX_OML_sync_control on top of this.
These are not easy to do, but it's the best I could manage so far. I
consulted some GStreamer experts, too.

I probably can't look into the Presentation queueing this year anymore,
but I intend to finalize Presentation feedback in two weeks. It's only
missing the feedback flags.

Comments welcome. :-)


Thanks,
pq

