[PATCH 1/3] client: Send errors should be fatal but not abort

Tue Jan 22 12:31:55 UTC 2019

On Fri,  7 Sep 2018 18:14:03 -0700
Lloyd Pique <lpique at google.com> wrote:

> If the core-client code encounters an error when doing an internal flush
> because the send buffers are full, the previous behavior is to
> wl_abort(). This is not very friendly if the client is a nested server
> running as a service.
> 
> This patch instead sets a fatal display error, which is one step
> better, and causes the send logic to do nothing once a fatal display
> error is set. The client should eventually fall back to its main loop,
> and be able to detect the fatal error there.
> 
> The existing display test is extended to exercise the error state.
> 
> Signed-off-by: Lloyd Pique <lpique at google.com>
> ---
>  COPYING              |  1 +
>  src/wayland-client.c |  9 +++++-
>  tests/display-test.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 86 insertions(+), 1 deletion(-)
> 
> diff --git a/COPYING b/COPYING
> index eb25a4e..bace5d7 100644
> --- a/COPYING
> +++ b/COPYING
> @@ -2,6 +2,7 @@ Copyright © 2008-2012 Kristian Høgsberg
>  Copyright © 2010-2012 Intel Corporation
>  Copyright © 2011 Benjamin Franzke
>  Copyright © 2012 Collabora, Ltd.
> +Copyright © 2018 Google LLC

Hi Lloyd,

would you like to add this in wayland-client.c as well?

The overall logic looks good, but there are details to discuss below.

>  
>  Permission is hereby granted, free of charge, to any person obtaining a
>  copy of this software and associated documentation files (the "Software"),
> diff --git a/src/wayland-client.c b/src/wayland-client.c
> index 0ccfc66..29ae1e0 100644
> --- a/src/wayland-client.c
> +++ b/src/wayland-client.c
> @@ -736,6 +736,9 @@ wl_proxy_marshal_array_constructor_versioned(struct wl_proxy *proxy,
>  			goto err_unlock;
>  	}
>  
> +	if (proxy->display->last_error)
> +		goto err_unlock;
> +
>  	closure = wl_closure_marshal(&proxy->object, opcode, args, message);
>  	if (closure == NULL)
>  		wl_abort("Error marshalling request: %s\n", strerror(errno));
> @@ -744,7 +747,11 @@ wl_proxy_marshal_array_constructor_versioned(struct wl_proxy *proxy,
>  		wl_closure_print(closure, &proxy->object, true);
>  
>  	if (wl_closure_send(closure, proxy->display->connection))
> -		wl_abort("Error sending request: %s\n", strerror(errno));
> +		// Convert EAGAIN into EPIPE, since EAGAIN is not really
> +		// supposed to be fatal, especially when returned by
> +		// wl_display_flush().

The style is to use /* */ comments.

> +		display_fatal_error(proxy->display,
> +			errno != EAGAIN ? errno : EPIPE);
>  
>  	wl_closure_destroy(closure);

The logic in wayland-client.c looks good. I'm just wondering if EPIPE is
a good error code to return. EPIPE can result otherwise as well,
particularly when the server disconnects.

Could we use something else like ENOMEM or ENOSPC that would be more
unique? I see ENOMEM already used, so maybe ENOSPC?

I checked, and libwayland-client does not seem to be checking
last_error against EPIPE or really any other specific value, so I think
the concern is about what the users of the library will do. FWIW,
Weston has no occurrences of EPIPE.

>  
> diff --git a/tests/display-test.c b/tests/display-test.c
> index 9b49a0e..23f0635 100644
> --- a/tests/display-test.c
> +++ b/tests/display-test.c
> @@ -1315,3 +1315,80 @@ TEST(zombie_fd_errant_consumption)
>  
>  	display_destroy(d);
>  }
> +
> +static void
> +terminate_on_bind_attempt(struct wl_client *client, void *data,
> +			  uint32_t version, uint32_t id)
> +{
> +	wl_display_terminate(((struct display *)data)->wl_display);
> +}
> +
> +static void
> +fill_client_send_buffers(struct wl_display *display,
> +	void (*send_check)(struct wl_display *, int *), int *send_check_result)

Align this line with "struct" above and split the last argument to a
new line.

> +{
> +	int sync_count = 4096;
> +	int socket_buf = 1024;
> +
> +	/* Use the minimum send buffer size. We want the client to be
> +	 * able to fill the buffer easily. */
> +	assert(setsockopt(wl_display_get_fd(display), SOL_SOCKET, SO_SNDBUF,
> +		         &socket_buf, sizeof(socket_buf)) == 0);

One more space of alignment to right.

> +
> +	/* Fill up the client and server buffers With the default error
> +	 * handling, this will abort the client. */
> +	while(sync_count--) {

Add space between "while" and "(".

> +		wl_callback_destroy(wl_display_sync(display));

This should consume 12 bytes per call IIRC, so it should easily fill up
the 4096+1024 bytes of wl_buffer and socket send buffer. Ok.

> +		if (send_check != NULL) {
> +			send_check(display, send_check_result);
> +		}
> +	}
> +
> +	wl_display_roundtrip(display);
> +}
> +
> +static void
> +send_error_client(void)
> +{
> +	struct client *c = client_connect();
> +
> +	/* Try and bind a wl_set, which isn't used but has a side effect. */

typo: "wl_seat"

What side-effect are you relying on?

> +	wl_proxy_destroy((struct wl_proxy *)client_get_seat(c));

Why not use wl_seat_destroy()?

> +
> +	/* Verify we do not have any errors at this point */
> +	assert(wl_display_get_error(c->wl_display) == 0);
> +
> +	fill_client_send_buffers(c->wl_display, NULL, NULL);
> +
> +	/* Verify that we see a fatal display error from the send. */
> +	assert(wl_display_get_error(c->wl_display) == EPIPE);
> +
> +	client_disconnect_nocheck(c);

Looks good.

> +}
> +
> +TEST(client_send_error_tst)

How about "client_send_flood_error" or such?

> +{
> +	struct display *d = display_create();
> +
> +	test_set_timeout(2);
> +
> +	client_create_noarg(d, send_error_client);
> +
> +	/* The client must connect before simulating an unresponsive state. Use
> +	 * the request for a set to terminate the event loop. */

Is it a typo "seat"?

> +	wl_global_create(d->wl_display, &wl_seat_interface,
> +			 1, d, terminate_on_bind_attempt);
> +
> +	/* This handles the initial connect, then returns. */
> +	display_run(d);
> +
> +	/* Not processing any events for a moment should allow the buffers
> +	 * to be filled up by the client, and it to enter an error state then
> +	 * disconnect. */
> +	test_sleep(1);
> +
> +	/* We need one last dispatch to handle the terminated client. */
> +	wl_event_loop_dispatch(wl_display_get_event_loop(d->wl_display), 0);
> +
> +	display_destroy(d);
> +}

Yeah, this logic looks correct. I tried to think how to use the "stop
display" mechanism, but that does not seem to fit because you need the
server to stall while the client continues.

I don't like sleeps much, I'd prefer to wait until the client process
exits, but I don't see a good way to implement that. Waiting on the pid
would probably make handle_client_destroy() fail unexpectedly.

Thanks,
pq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/wayland-devel/attachments/20190122/745a1600/attachment.sig>