[PATCH] server: Fix crash when accessing client which is already freed

Mon Nov 21 14:05:24 UTC 2016

Hi Pekka,

On 21 November 2016 at 13:30, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> On Mon, 21 Nov 2016 12:19:43 +0000 Daniel Stone <daniel at fooishbar.org> wrote:
>> I'm would suggest we document that invoking wl_display_flush_clients() /
>> wl_connection_flush() from a request handler may cause undefined behaviour,
>> and that it must only be called outside of dispatch.
>
> FWIW, I agree with Daniel here. There are several things that are not
> quite defined or expected to be done inside dispatch calls, and
> flushing all clients is one of them.
>
> I could see, IIRC, how sending lots of events from inside dispatch
> could cause the one client to be flushed. I think flush is done
> automatically if the send buffer fills up. If the flush at that point
> fails, there is no way to signal back that "oops, the compositor should
> put that thing aside and do something else" until the connection
> unblocks. We don't want to be checking and handling every send-event
> possibly failing, so killing the client can happen. ISTR we don't
> dynamically grow the send buffers, do we? But even the growing would
> need to stop somewhere.

Two paths, depending on whether the event is queued or immediately
posted. If the latter, then the event will be flushed immediately. If
the former, _and_ the queue is already full, then the queue will be
flushed immediately before the event is written to the queue. In
either case, if the flush fails, then client->error will be set. Since
we work exclusively on NOWAIT connections, this includes if the kernel
buffers are full because the client hasn't yet read.

Either way, this is not related to the above patch, because it takes a
different codepath (wl_resource_queue_event_array -> wl_closure_queue
-> wl_connection_queue -> wl_connection_flush /
wl_resource_post_event_array -> wl_closure_send -> wl_connection_write
-> wl_connection_flush), not involving calling
wl_display_flush_clients(). This implicit mid-dispatch flush is
completely fine: the patch only fixes the compositor _explicitly_
flushing _all_ clients in the middle of request dispatch.

(Side note: I suspect there might be some scope for increasing our -
currently static - buffer limit, but making that limit too high just
shifts the problem of unresponsive clients. But this also does not
have any bearing on this patch.)

Cheers,
Daniel