[PATCH weston v2] compositor-drm: pageflip timeout implementation

Pekka Paalanen ppaalanen at gmail.com
Thu Mar 2 13:56:12 UTC 2017


On Tue, 28 Feb 2017 18:35:11 +0000
Emmanuel Gil Peyrot <emmanuel.peyrot at collabora.com> wrote:

> Weston will not repaint until previous update has been acked by a
> pageflip event coming from the drm driver. However, some buggy drivers
> won’t return those events or will stop sending them at some point and
> Weston output repaints will completely freeze. To ease developers’ task
> in testing their drivers, this patch makes compositor-drm use a timer
> to detect cases where those pageflip events stop coming.
> 
> This timeout implementation is software only and includes basic
> features usually found in a watchdog. We simply exit Weston gracefully
> with a log message and an exit code when the timout is reached.
> 
> The timeout value can be set via weston.ini by adding a
> pageflip-timeout=<MILLISECONDS> entry under a new [compositor-drm]
> section. Setting it to 0 disables the timeout feature.
> 
> v2:
> - Made sure we would get both the pageflip and the vblank events before
>   stopping the timer.
> - Reordered the error and success cases in
>   drm_output_pageflip_timer_create() to be more in line with the rest
>   of the code.
> 
> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=83884
> Signed-off-by: Frederic Plourde <frederic.plourde at collabora.co.uk>
> Signed-off-by: Emmanuel Gil Peyrot <emmanuel.peyrot at collabora.com>
> ---
>  compositor/main.c          |  2 ++
>  libweston/compositor-drm.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
>  libweston/compositor-drm.h |  7 +++++
>  man/weston.ini.man         |  5 ++++
>  4 files changed, 78 insertions(+)
> 

Hi,

there are some details to fix, but looking good overall.

> diff --git a/compositor/main.c b/compositor/main.c
> index 72c3cd10..e870dd4a 100644
> --- a/compositor/main.c
> +++ b/compositor/main.c
> @@ -1235,6 +1235,8 @@ load_drm_backend(struct weston_compositor *c,
>  	weston_config_section_get_string(section,
>  					 "gbm-format", &config.gbm_format,
>  					 NULL);
> +	weston_config_section_get_uint(section, "pageflip-timeout",
> +	                               &config.pageflip_timeout, 0);
>  
>  	config.base.struct_version = WESTON_DRM_BACKEND_CONFIG_VERSION;
>  	config.base.struct_size = sizeof(struct weston_drm_backend_config);
> diff --git a/libweston/compositor-drm.c b/libweston/compositor-drm.c
> index 7f78699c..e85ebab8 100644
> --- a/libweston/compositor-drm.c
> +++ b/libweston/compositor-drm.c
> @@ -119,6 +119,7 @@ struct drm_backend {
>  	int32_t cursor_height;
>  
>  	uint32_t connector;
> +	uint32_t pageflip_timeout;
>  };
>  
>  struct drm_mode {
> @@ -182,6 +183,8 @@ struct drm_output {
>  
>  	struct vaapi_recorder *recorder;
>  	struct wl_listener recorder_frame_listener;
> +
> +	struct wl_event_source *pageflip_timer;
>  };
>  
>  /*
> @@ -225,6 +228,44 @@ to_drm_backend(struct weston_compositor *base)
>  	return container_of(base->backend, struct drm_backend, base);
>  }
>  
> +static int
> +pageflip_timeout(void *data) {
> +	/*
> +	 * Our timer just went off, that means we're not receiving drm
> +	 * page flip events anymore for that output. Let's gracefully exit
> +	 * weston with a return value so devs can debug what's going on.
> +	 */
> +	struct drm_output *output = data;
> +	struct weston_compositor *compositor = output->base.compositor;
> +
> +	weston_log("Pageflip timeout reached, your driver is probably "
> +	           "buggy!  Exiting.\n");

While at it, you might print the output name here.

> +	weston_compositor_exit_with_code(compositor, EXIT_FAILURE);
> +
> +	return -1;

pageflip_timeout() is a timer callback, which is a libwayland-server
event loop dispatch vfunc. Returning -1 here might theoretically
confuse other dispatch vfuncs called from wl_event_loop_dispatch()'s
post_dispatch_check() phase. The post_dispatch_check() is intended to
loop until all dispatch funcs return zero.

However, a source must be explicitly marked to be processed in
post_dispatch_check() by calling wl_event_source_check(), and we don't
do that for timers in Weston. While it is harmless to return -1 in this
case, I would like to see the return value to be 0 to denote there is
no post-check needed.

> +}
> +
> +/* Creates the pageflip timer. Note that it isn't armed by default */
> +static int
> +drm_output_pageflip_timer_create(struct drm_output *output)
> +{
> +	struct wl_event_loop *loop = NULL;
> +	struct weston_compositor *ec = output->base.compositor;
> +
> +	loop = wl_display_get_event_loop(ec->wl_display);
> +	assert(loop);
> +	output->pageflip_timer = wl_event_loop_add_timer(loop,
> +	                                                 pageflip_timeout,
> +	                                                 output);
> +
> +	if (output->pageflip_timer == NULL) {
> +		weston_log("creating drm pageflip timer failed: %m\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
>  static void
>  drm_output_set_cursor(struct drm_output *output);
>  
> @@ -749,6 +790,10 @@ drm_output_repaint(struct weston_output *output_base,
>  		output_base->set_dpms(output_base, WESTON_DPMS_ON);
>  	}
>  
> +	if (output->pageflip_timer)
> +		wl_event_source_timer_update(output->pageflip_timer,
> +		                             backend->pageflip_timeout);
> +
>  	if (drmModePageFlip(backend->drm.fd, output->crtc_id,
>  			    output->next->fb_id,
>  			    DRM_MODE_PAGE_FLIP_EVENT, output) < 0) {
> @@ -872,6 +917,10 @@ drm_output_start_repaint_loop(struct weston_output *output_base)
>  	 */
>  	fb_id = output->current->fb_id;
>  
> +	if (output->pageflip_timer)
> +		wl_event_source_timer_update(output->pageflip_timer,
> +		                             backend->pageflip_timeout);
> +
>  	if (drmModePageFlip(backend->drm.fd, output->crtc_id, fb_id,
>  			    DRM_MODE_PAGE_FLIP_EVENT, output) < 0) {
>  		weston_log("queueing pageflip failed: %m\n");

In both cases above, you arm the timer before attempting to queue a
pageflip, but do not disarm the timer if the queueing fails.

How about moving the timer arming after the pageflip queueing succeeded?

The timer cannot protect against drmModePageFlip() hanging either
because in that case there would be nothing to dispatch the timer.

> @@ -919,6 +968,10 @@ vblank_handler(int fd, unsigned int frame, unsigned int sec, unsigned int usec,
>  		ts.tv_sec = sec;
>  		ts.tv_nsec = usec * 1000;
>  		weston_output_finish_frame(&output->base, &ts, flags);
> +
> +		/* Stop the pageflip timer instead of rearming it here */
> +		if (output->pageflip_timer)
> +			wl_event_source_timer_update(output->pageflip_timer, 0);

You should probably disarm the timer before calling
weston_output_finish_frame(), because finish_frame may call
weston_output_repaint() which then calls drm_output_repaint() which
intends to arm the timer.

>  	}
>  }
>  
> @@ -957,6 +1010,10 @@ page_flip_handler(int fd, unsigned int frame,
>  		ts.tv_nsec = usec * 1000;
>  		weston_output_finish_frame(&output->base, &ts, flags);
>  
> +		/* Stop the pageflip timer instead of rearming it here */
> +		if (output->pageflip_timer)
> +			wl_event_source_timer_update(output->pageflip_timer, 0);
> +

The same here.

>  		/* We can't call this from frame_notify, because the output's
>  		 * repaint needed flag is cleared just after that */
>  		if (output->recorder)
> @@ -2418,6 +2475,9 @@ drm_output_enable(struct weston_output *base)
>  
>  	output->dpms_prop = drm_get_prop(b->drm.fd, output->connector, "DPMS");
>  
> +	if (b->pageflip_timeout)
> +		drm_output_pageflip_timer_create(output);
> +
>  	if (b->use_pixman) {
>  		if (drm_output_init_pixman(output, b) < 0) {
>  			weston_log("Failed to init output pixman state\n");
> @@ -2532,6 +2592,9 @@ drm_output_destroy(struct weston_output *base)
>  		drmModeFreeCrtc(origcrtc);
>  	}
>  
> +	if (output->pageflip_timer)
> +		wl_event_source_remove(output->pageflip_timer);
> +
>  	weston_output_destroy(&output->base);
>  
>  	drmModeFreeConnector(output->connector);
> @@ -3309,6 +3372,7 @@ drm_backend_create(struct weston_compositor *compositor,
>  	b->sprites_are_broken = 1;
>  	b->compositor = compositor;
>  	b->use_pixman = config->use_pixman;
> +	b->pageflip_timeout = config->pageflip_timeout;
>  
>  	if (parse_gbm_format(config->gbm_format, GBM_FORMAT_XRGB8888, &b->gbm_format) < 0)
>  		goto err_compositor;
> diff --git a/libweston/compositor-drm.h b/libweston/compositor-drm.h
> index 2e2995a2..08707197 100644
> --- a/libweston/compositor-drm.h
> +++ b/libweston/compositor-drm.h
> @@ -138,6 +138,13 @@ struct weston_drm_backend_config {
>  	 */
>  	void (*configure_device)(struct weston_compositor *compositor,
>  				 struct libinput_device *device);
> +
> +	/** Maximum duration for a pageflip event to arrive, after which the
> +	 * compositor will consider the DRM driver crashed and will try to exit
> +	 * cleanly.
> +	 *
> +	 * It is exprimed in milliseconds, 0 means disabled. */
> +	uint32_t pageflip_timeout;

Extending the weston_drm_backend_config struct is done correctly here.
- There is no need to bump WESTON_DRM_BACKEND_CONFIG_VERSION because
  the existing fields remain intact.
- The default value is 0, which is also automatically filled in if an
  old config struct is passed in.

Nice.

>  };
>  
>  #ifdef  __cplusplus
> diff --git a/man/weston.ini.man b/man/weston.ini.man
> index 2edb0854..5ec0e1d1 100644
> --- a/man/weston.ini.man
> +++ b/man/weston.ini.man
> @@ -162,6 +162,11 @@ By default, xrgb8888 is used.
>  sets Weston's idle timeout in seconds. This idle timeout is the time
>  after which Weston will enter an "inactive" mode and screen will fade to
>  black. A value of 0 disables the timeout.
> +.TP 7
> +.BI "pageflip-timeout="milliseconds
> +sets Weston's pageflip timeout in milliseconds.  This sets a timer to exit
> +gracefully with a log message and an exit code of 1 in case the DRM driver is
> +non-responsive.  Setting it to 0 disables this feature.
>  
>  .IR Important
>  : This option may also be set via Weston's '-i' command

You put the weston.ini option in the core section, which made me
wonder, why is this feature specific to DRM. It is specific to DRM
indeed: most other backends already use a timer anyway, and the Wayland
backend would throttle based on the parent compositor which can delay
completion arbitrarily.

Then I wondered why DRM-specific options are in weston.ini and the core
section, but there is already one: gbm-format. These two could be moved
into the weston-drm man page, but that's a topic for another patch.

Very good to remember the manual.


Thanks,
pq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 801 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/wayland-devel/attachments/20170302/119cd5c0/attachment.sig>


More information about the wayland-devel mailing list