[PATCH weston 6/7] Switch to global output repaint hook

Thu Feb 16 17:04:32 UTC 2017

Hi,

the subject should say "timer", not "hook" I presume.

For some background, here is first an explanation of how upstream
Weston works currently.

weston_output_schedule_repaint() gets called from all over the place,
including as a response to any damage caused. It gets called directly
from request handlers, so it is a relatively hot function. It's purpose
is to ensure the output repaint will start once the immediate hassle
(client requests) have been processed. For that it uses an idle
callback, which calls idle_repaint().

weston_output_schedule_repaint() guarantees that
output->repaint_scheduled is true, starting the continuous repaint
loop mode if it wasn't. The continuous repaint loop mode is flagged by
output->repaint_scheduled being true.

The idle_repaint() is armed and called only in the case the output
repaint was idle.

idle_repaint() simply calls output->start_repaint_loop().

output->start_repaint_loop() calls into the backend, which will call
weston_output_finish_frame() with the timestamp of the previous vblank.
This starts a repaint cycle.

weston_output_finish_frame() computes the repaint_delay and either
schedules a timer callback, or if it is late already, calls the timer
callback directly: output_repaint_timer_handler().

output_repaint_timer_handler() checks if there actually was any need to
repaint at all (output->repaint_needed, set by
weston_output_schedule_repaint() and cleared by
weston_output_repaint()). This check is done here, after the
repaint_delay, to give clients the opportunity to post updates before
we drop out from the continuous repaint loop. If the check was done any
earlier, we would be dropping out from the continuous repaint all the
time, having to start every repaint with a call to
start_repaint_loop(). Instead, if we know we updated on the very last
vblank, we have its timestamp already, and can avoid asking for it
redundantly.

If output_repaint_timer_handler() repaint_needed == false, it will also
clear repaint_scheduled flag, so that weston_output_schedule_repaint()
can start a new repaint cycle from idle. Continuous repaint cycle ends
here.

Otherwise, output_repaint_timer_handler() calls weston_output_repaint().

weston_output_repaint() updates surface states and damage, calls into
output->repaint(), clears the repaint_needed flag, sends out frame
callbacks, and ticks internal animations.

output->repaint() calls into the renderer, does what is necessary to
schedule an update to the screen, and guarantees that
weston_output_finish_frame() will be called when the update completes.
Notably, the repaint_scheduled flag is still true.

Then, either later or immediately, the backend calls
weston_output_finish_frame(), which sets up a new repaint_delay, giving
time for clients to update (clients cause damage, damage causes
repaint_needed to be set to true in weston_output_schedule_repaint()).
Go to output_repaint_timer_handler().

Then on to the subject matter.

On Tue, 14 Feb 2017 13:18:06 +0000
Daniel Stone <daniels at collabora.com> wrote:

> In preparation for grouping output repaint together where possible,
> switch the per-output repaint timer to a global timer which iterates
> across all outputs.
> 
> This is implemented by storing the absolute time for the next repaint
> for each output locally, and maintaining a global timer hook which
> iterates all of them, scheduling the repaint for the first available
> time.
> 
> Signed-off-by: Daniel Stone <daniels at collabora.com>
> Cc: Mario Kleiner <mario.kleiner.de at gmail.com>
> Cc: Pekka Paalanen <pekka.paalanen at collabora.co.uk>
> ---
>  libweston/compositor.c | 101 +++++++++++++++++++++++++++++++++++++------------
>  libweston/compositor.h |   3 +-
>  2 files changed, 79 insertions(+), 25 deletions(-)
> 
> diff --git a/libweston/compositor.c b/libweston/compositor.c
> index ed8ef90fd..7ba6b45bf 100644
> --- a/libweston/compositor.c
> +++ b/libweston/compositor.c
> @@ -2332,16 +2332,22 @@ weston_output_repaint(struct weston_output *output)
>  static void
>  weston_output_schedule_repaint_reset(struct weston_output *output)
>  {
> +	output->next_repaint = 0;
>  	output->repaint_scheduled = 0;
>  	TL_POINT("core_repaint_exit_loop", TLP_OUTPUT(output), TLP_END);
>  }
>  
> -static int
> -output_repaint_timer_handler(void *data)
> +static void
> +weston_output_maybe_repaint(struct weston_output *output,
> +			    struct timespec *now)
>  {
> -	struct weston_output *output = data;
>  	struct weston_compositor *compositor = output->compositor;
>  	int ret;
> +	uint32_t now_ms = timespec_to_msec(now);

See notes below on output_repaint_timer_arm() about the overflow.

> +
> +	/* We're not ready yet; come back to make a decision later. */
> +	if (output->next_repaint == 0 || now_ms < output->next_repaint - 1)
> +		return;
>  
>  	/* If we're sleeping, drop the repaint machinery entirely; we will
>  	 * explicitly repaint all outputs when we come back. */
> @@ -2354,17 +2360,65 @@ output_repaint_timer_handler(void *data)
>  	if (!output->repaint_needed)
>  		goto err;
>  
> +	output->next_repaint = 0;
> +
>  	/* If repaint fails, we aren't going to get weston_output_finish_frame
>  	 * to trigger a new repaint, so drop it from repaint and hope
> -	 * something schedules a successful repaint later. */
> +	 * something schedules a successful repaint later. As repainting may
> +	 * take some time, re-read our clock as a courtesy to the next
> +	 * output. */
>  	ret = weston_output_repaint(output);
> +	weston_compositor_read_presentation_clock(compositor, now);
>  	if (ret != 0)
>  		goto err;
>  
> -	return 0;
> +	return;
>  
>  err:
> +	output->next_repaint = 0;
>  	weston_output_schedule_repaint_reset(output);
> +}

weston_output_maybe_repaint() looks nice. In fact, I like this patch a
lot. Currently weston_output_repaint() is doing all the surface state
updates once per each output, but this architecture you are building
would allow us to do that once for all outputs getting repainted on the
same go.

> +
> +static void
> +output_repaint_timer_arm(struct weston_compositor *compositor)
> +{
> +	struct weston_output *output;
> +	uint32_t first_time = 0xffffffff;
> +	struct timespec now;
> +	uint32_t now_ms;
> +
> +	wl_list_for_each(output, &compositor->output_list, link) {
> +		if (output->next_repaint != 0 &&
> +		    output->next_repaint < first_time)
> +			first_time = output->next_repaint;
> +	}
> +	weston_compositor_read_presentation_clock(compositor, &now);
> +	now_ms = timespec_to_msec(&now);

'now' is an absolute time, and converting it to uint32_t msec will
truncate the value. It is possible to produce 0xffffffff as a valid
time.

I made an experiment on my machine, reading CLOCK_MONOTONIC:
ts 4412406.344457113, i64 0x106fffa48, u32 0x6fffa48

That is the timespec printed as decimal seconds, then the output of
timespec_to_msec() as int64_t, and finally cast to uint32_t. My machine
has uptime of 51 days, it seems to coincide with the monotonic clock
value, but is not guaranteed.

I believe we either need to choose an epoch and use int64_t msec or
nsec values as difference from the epoch, or just deal with struct
timespec. This is because the manual says about CLOCK_MONOTONIC:
"represents monotonic time since some unspecified starting point." The
base offset for the clock is arbitrary.

I you need a timespec_cmp(), you can steal one from here:
https://github.com/ppaalanen/wesgr/blob/master/graphdata.c#L192
I sent the same as part of presentation.queue implementation to
wayland-devel@ in 2014.

https://github.com/ppaalanen/wesgr/blob/master/wesgr.h#L218
has also timespec_invalidate() and timespec_is_valid(), FWIW.

> +
> +	if (first_time == 0xffffffff)
> +		return;
> +
> +	/* Now we switch first_time into relative units for the timer. */
> +	if (first_time <= now_ms + 1)
> +		first_time = 1;
> +	else
> +		first_time -= now_ms;
> +	wl_event_source_timer_update(compositor->repaint_timer, first_time);

This makes a minimum delay of 1 ms. I worry that if
output_repaint_timer_arm() (e.g. via weston_output_schedule_repaint()
which probably gets called quite often) gets called too often, the
timer will never expire.

Maybe you need an idle callback instead when we need to repaint ASAP?

However, see my note at weston_output_schedule_repaint() below.

Maybe you should just call output_repaint_timer_handler() directly if
the time has passed?

> +}
> +
> +static int
> +output_repaint_timer_handler(void *data)
> +{
> +	struct weston_compositor *compositor = data;
> +	struct weston_output *output;
> +	struct timespec now;
> +
> +	weston_compositor_read_presentation_clock(compositor, &now);
> +	wl_list_for_each(output, &compositor->output_list, link)
> +		weston_output_maybe_repaint(output, &now);
> +
> +	output_repaint_timer_arm(compositor);
> +
>  	return 0;
>  }
>  
> @@ -2377,17 +2431,22 @@ weston_output_finish_frame(struct weston_output *output,
>  	int32_t refresh_nsec;
>  	struct timespec now;
>  	struct timespec next;
> -	struct timespec remain;
> -	int msec_rel = 0;
> +	uint32_t now_ms;
> +	int msec_rel;
>  
>  	TL_POINT("core_repaint_finished", TLP_OUTPUT(output),
>  		 TLP_VBLANK(stamp), TLP_END);
>  
> +	weston_compositor_read_presentation_clock(compositor, &now);
> +	now_ms = timespec_to_msec(&now);

The overflow problem.

> +
>  	/* If we haven't been supplied any timestamp at all, we don't have a
>  	 * timebase to work against, so any delay just wastes time. Push a
>  	 * repaint as soon as possible so we can get on with it. */
> -	if (!stamp)
> +	if (!stamp) {
> +		output->next_repaint = now_ms;
>  		goto out;
> +	}
>  
>  	refresh_nsec = millihz_to_nsec(output->current_mode->refresh);
>  	weston_presentation_feedback_present_list(&output->feedback_list,
> @@ -2396,12 +2455,11 @@ weston_output_finish_frame(struct weston_output *output,
>  						  presented_flags);
>  
>  	output->frame_time = timespec_to_msec(stamp);

Oh boy... I noticed this only now. Overflow here too, and it has been
upstream for who knows how long. Luckily frame_time is not used for
anything important, just as the frame callback argument, internal
animation timings, and the video recorder...

> -	weston_compositor_read_presentation_clock(compositor, &now);
>  
>  	timespec_add_nsec(&next, stamp, refresh_nsec);
>  	timespec_add_msec(&next, &next, -compositor->repaint_msec);
> -	timespec_sub(&remain, &next, &now);
> -	msec_rel = timespec_to_msec(&remain);
> +	output->next_repaint = timespec_to_msec(&next);

Overflow.

> +	msec_rel = output->next_repaint - now_ms;
>  
>  	if (msec_rel < -1000 || msec_rel > 1000) {

msec_rel is checked for insanity here, reset if it is insane, but it is
not used for actual timing anymore. The insanity check should fix
output->next_repaint instead, or the check should be moved to
output_repaint_timer_arm().

>  		static bool warned;
> @@ -2419,13 +2477,10 @@ weston_output_finish_frame(struct weston_output *output,
>  	 * the deadline of the next frame, to give clients a more predictable
>  	 * timing of the repaint cycle to lock on. */
>  	if (presented_flags == WP_PRESENTATION_FEEDBACK_INVALID && msec_rel < 0)
> -		msec_rel += refresh_nsec / 1000000;
> +		output->next_repaint += refresh_nsec / 1000000;
>  
>  out:
> -	if (msec_rel < 1)
> -		output_repaint_timer_handler(output);
> -	else
> -		wl_event_source_timer_update(output->repaint_timer, msec_rel);
> +	output_repaint_timer_arm(compositor);
>  }
>  
>  static void
> @@ -2548,8 +2603,10 @@ weston_output_schedule_repaint(struct weston_output *output)
>  
>  	loop = wl_display_get_event_loop(compositor->wl_display);
>  	output->repaint_needed = 1;
> -	if (output->repaint_scheduled)
> +	if (output->repaint_scheduled) {
> +		output_repaint_timer_arm(compositor);

Tbqh, I don't understand why call this here?

>  		return;
> +	}
>  
>  	wl_event_loop_add_idle(loop, idle_repaint, output);
>  	output->repaint_scheduled = 1;
> @@ -4409,8 +4466,6 @@ weston_output_transform_coordinate(struct weston_output *output,
>  static void
>  weston_output_enable_undo(struct weston_output *output)
>  {
> -	wl_event_source_remove(output->repaint_timer);
> -
>  	wl_global_destroy(output->global);
>  
>  	pixman_region32_fini(&output->region);
> @@ -4592,7 +4647,6 @@ weston_output_enable(struct weston_output *output)
>  {
>  	struct weston_compositor *c = output->compositor;
>  	struct weston_output *iterator;
> -	struct wl_event_loop *loop;
>  	int x = 0, y = 0;
>  
>  	assert(output->enable);
> @@ -4633,10 +4687,6 @@ weston_output_enable(struct weston_output *output)
>  	wl_list_init(&output->feedback_list);
>  	wl_list_init(&output->link);
>  
> -	loop = wl_display_get_event_loop(c->wl_display);
> -	output->repaint_timer = wl_event_loop_add_timer(loop,
> -					output_repaint_timer_handler, output);
> -
>  	/* Invert the output id pool and look for the lowest numbered
>  	 * switch (the least significant bit).  Take that bit's position
>  	 * as our ID, and mark it used in the compositor's output_id_pool.
> @@ -5138,6 +5188,9 @@ weston_compositor_create(struct wl_display *display, void *user_data)
>  
>  	loop = wl_display_get_event_loop(ec->wl_display);
>  	ec->idle_source = wl_event_loop_add_timer(loop, idle_handler, ec);
> +	ec->repaint_timer =
> +		wl_event_loop_add_timer(loop, output_repaint_timer_handler,
> +					ec);
>  
>  	weston_layer_init(&ec->fade_layer, ec);
>  	weston_layer_init(&ec->cursor_layer, ec);
> diff --git a/libweston/compositor.h b/libweston/compositor.h
> index 9cdd682b3..959cd9492 100644
> --- a/libweston/compositor.h
> +++ b/libweston/compositor.h
> @@ -172,7 +172,7 @@ struct weston_output {
>  	pixman_region32_t previous_damage;
>  	int repaint_needed;
>  	int repaint_scheduled;
> -	struct wl_event_source *repaint_timer;
> +	uint32_t next_repaint;

Since this is an absolute time, it has the overflow issues as described
earlier.

>  	struct weston_output_zoom zoom;
>  	int dirty;
>  	struct wl_signal frame_signal;
> @@ -845,6 +845,7 @@ struct weston_compositor {
>  	struct wl_event_source *idle_source;
>  	uint32_t idle_inhibit;
>  	int idle_time;			/* timeout, s */
> +	struct wl_event_source *repaint_timer;
>  
>  	const struct weston_pointer_grab_interface *default_pointer_grab;
>  

Thanks,
pq