Recursive dispatch (Re: [PATCH v2] server: Calculate remaining data size after a closure is processed)

Pekka Paalanen ppaalanen at gmail.com
Wed Feb 3 11:37:21 UTC 2016


On Wed, 3 Feb 2016 10:52:07 +0000
Auke Booij <auke at tulcod.com> wrote:

> On 3 February 2016 at 09:34, Jonas Ådahl <jadahl at gmail.com> wrote:
> > On Wed, Feb 03, 2016 at 11:14:54AM +0200, Pekka Paalanen wrote:  
> >> On Wed, 3 Feb 2016 09:56:20 +0900
> >> "Jaeyoon Jung" <jaeyoon.jung at lge.com> wrote:
> >>  
> >> > > -----Original Message-----
> >> > > From: Derek Foreman [mailto:derekf at osg.samsung.com]
> >> > > Sent: Wednesday, February 03, 2016 6:56 AM
> >> > > To: Pekka Paalanen; Jaeyoon Jung
> >> > > Cc: 'Jonas Ådahl'; wayland-devel at lists.freedesktop.org
> >> > > Subject: Re: Recursive dispatch (Re: [PATCH v2] server: Calculate remaining
> >> > > data size after a closure is processed)
> >> > >
> >> > > On 02/02/16 06:57 AM, Pekka Paalanen wrote:  
> >> > > > On Tue, 12 Jan 2016 13:03:19 +0900 "Jaeyoon Jung"
> >> > > > <jaeyoon.jung at lge.com> wrote:
> >> > > >  
> >> > > >>> -----Original Message----- From: wayland-devel
> >> > > >>> [mailto:wayland-devel-bounces at lists.freedesktop.org] On Behalf
> >> > > >>> Of Jonas Adahl Sent: Tuesday, January 12, 2016 12:27 PM To:
> >> > > >>> Jaeyoon Jung Cc: wayland-devel at lists.freedesktop.org Subject:
> >> > > >>> Re: [PATCH v2] server: Calculate remaining data size after a
> >> > > >>> closure is processed
> >> > > >>>
> >> > > >>> On Tue, Jan 12, 2016 at 08:22:56AM +0900, Jaeyoon Jung wrote:
> >> > > >>>  
> >> > > >>>> When processing a closure, data in the connection can be
> >> > > >>>> consumed again if the closure itself invokes extra event
> >> > > >>>> dispatch. In that case the remaining data size is also
> >> > > >>>> altered, so the variable len should be updated after the
> >> > > >>>> closure is processed.
> >> > > >>>>
> >> > > >>>> Signed-off-by: Jaeyoon Jung <jaeyoon.jung at lge.com>  
> >> > > >>>
> >> > > >>> I don't like the name wl_connection_size
> >> > > >>> (wl_connection_pending_input() or something would be more
> >> > > >>> descriptive).  
> >> > > >>
> >> > > >> I just wanted to have a shorter name.
> >> > > >> wl_connection_pending_input() looks better to me as well, and
> >> > > >> I'll update the patch accordingly. Thanks.
> >> > > >>
> >> > > >>  
> >> > > >>> A good follow up would be a test case where this would
> >> > > >>> otherwise cause issues. I assume it would be triggered by a
> >> > > >>> wl_event_loop_dispatch() in a interface request handler?  
> >> > > >>
> >> > > >> Yes, exactly.  
> >> > > >
> >> > > > Hi,
> >> > > >
> >> > > > could you humor me and explain in what sort of situations
> >> > > > dispatching from within a request handler is useful?  
> >> >
> >> > Suppose a server has a single event queue for all kind of events
> >> > including wayland events. If the server wants to process more events
> >> > from the queue in the middle of the request handler, and there is
> >> > another wayland request arrived in the queue, it induces a recursive
> >> > dispatch. I'm not saying that this is a good practice though.  
> >>
> >> Right, this is a tautology: "the server wants to process more
> >> events in the middle of a request handler" is the very definition of
> >> recursive dispatch, not a justification for it.
> >>
> >>  
> >> > > > I thought recursive dispatch is a programming pattern that can
> >> > > > only lead to painful and erratic problems later. Don't we consider
> >> > > > this to be bad practice also in the client API?  
> >> > >
> >> > > Is that actually documented somewhere?  
> >>
> >> I don't think it is. It would be worth to make an official statement
> >> about it in the docs.
> >>
> >> To me it is common sense (or expert knowledge) about event loops in
> >> general.
> >>  
> >> > > > If you do it in the server, that is like: this message causes more
> >> > > > messages to be dispatched - how does that make sense? A server must
> >> > > > not block waiting for any client message.  
> >> > >
> >> > > I agree that a server blocking on a client message is a great way to
> >> > > create a terrible user experience - but I'm not certain it's
> >> > > libwayland's mandate to prevent it?  
> >>
> >> Libwayland could prevent recursive dispatch because it has not been
> >> intended to work. This patch proves that is actually was broken.
> >>
> >> Preventing recursive dispatch does not prevent writing a server that
> >> blocks waiting on a client message. You can very well do that without
> >> recursive dispatch, but it's just a little harder to write.
> >>  
> >> > > > Also, inspecting future messages before finishing the processing of
> >> > > > the current message is racy: the following messages may or may not
> >> > > > be in the receive buffer and nothing guarantees either way, you
> >> > > > just get lucky a lot of times. Wayland IPC does not guarantee more
> >> > > > than one message to be delivered at a time, this is why we have
> >> > > > things like wl_surface.commit request.  
> >> > >
> >> > > This is a really good point. :)
> >> > >  
> >> > > > If someone starts hardening libwayland against programmer errors,
> >> > > > I'd expect one thing to be to abort() on recursive dispatch. Can
> >> > > > you tell me why we should support recursive dispatch?  
> >> > >
> >> > > IMHO we should either do that immediately (before the 1.10 release) or
> >> > > revert this patch now.  Not that anything is wrong with the patch - it
> >> > > doesn't introduce bugs in libwayland...
> >> > >
> >> > > Previously recursive dispatch was impossible, with this patch it's
> >> > > possible.  The patch was landed to allow recursive dispatch to work,
> >> > > and it's obvious someone's about to start using it.  
> >>
> >> You're right, we should make a statement about this. I'd be happy to
> >> accept a doc patch to be included in 1.10 as AFAIK we have never
> >> intended recursive dispatch to work reliably in either server or client
> >> code. I do not consider it as a change in API/ABI or behaviour, just a
> >> doc improvement.
> >>  
> >> >
> >> > To be more exact, previously recursive dispatch allowed a message copy
> >> > from empty buffer and resulted in an irrelevant error post to the client.
> >> > If we want to disallow recursive dispatch, it would be nice to raise an
> >> > error like abort() when it is detected. Reverting this patch is just
> >> > going back to the past without resolving the issue.  
> >>
> >> Raising an error like abort() is a more touchy subject, because it
> >> indeed has potential to break existing programs that work by luck. We
> >> probably want an escape hatch for a while, like an environment variable
> >> to not abort() and just complain. Or maybe complaining is enough at
> >> first and add the abort() after a couple stable releases.
> >>
> >> I won't propose anyone to do that work now, because I am hoping someone
> >> might do it as a part of GSoC (yes, there are plans for GSoC, just not
> >> announced yet I think).
> >>  
> >> > I agree that recursive dispatch might be harmful and should be avoided.
> >> > However, I'm not sure if it is something must be done in libwayland.
> >> > Isn't it sufficient to be documented somewhere and let the implementor
> >> > make a decision?  
> >>
> >> If we document that recursive dispatch is not supported, then we won't
> >> bother with patches to fix problems with recursive dispatch, which
> >> implies that we could also just revert this patch since we won't care.
> >> But the patch is in, and I don't think we win anything by reverting it.
> >>
> >> If some usage pattern is known broken, I think it would be nice to
> >> catch them and at least complain in libwayland. So far we haven't added
> >> such code for even every obvious mistake, like destroying objects while
> >> other objects are still referencing them.
> >>  
> >> > > If we later intentionally break that code, we're jerks.
> >> > > -- From the jerk that posted a bunch of patches to do more stringent
> >> > > checks on resource versions a few days ago...  arguably those have
> >> > > always been bugs, though. ;)  
> >>
> >> Yeah, recursive dispatch is less clear, so we should have a transition
> >> period.
> >>  
> >> > > So, with the release coming so quickly, I think we should either
> >> > > decide whether recursive dispatch is legal or illegal before the
> >> > > release, or revert the patch (thus re-breaking recursive dispatch) and
> >> > > worry about it for 1.11.  
> >>
> >> I would declare recursive dispatch as illegal. Does anyone object?  
> >
> > I don't object declaring it illegal in libwayland-server since it was
> > previously broken, but doing so in libwayland-client I don't think we
> > should (at least yet) since it is used here and there in various clients
> > already.  

Ok, let's do it for server since that's what we are looking at right
now. (Warnings and aborts would be implemented later.)

Client side is indeed more complicated, since it provably works
somewhat, we've had it in some Weston demos even, that were since fixed
AFAIK.

Maybe we could start client side sanitation by these steps release by
release:

1. Document that recursive dispatch is problematic and recommend
   against it.

2. Add a warning when a recursive dispatch is detected, maybe once per
   wl_display (or wl_event_queue?).

3. Make the warning be printed every time, not just once.

4. Make the warning also abort().

Now, how far we will go depends on the reception. I think we should do
1, and maybe stop at 2. We could try 3 and see how much feedback it
generates.

It's also possible that we have to stop at 1, because it is too easy to
end up in an event handler calling a third party library routine which
waits for an event. Waiting means it has to dispatch. Calling
eglSwapBuffers from a frame callback handler, for instance.

Or maybe we should do it per wl_event_queue, so that if dispatching an
event queue ends up dispatching itself, warn only then. Dispatching the
same queue is the biggest problem, right?

Btw. are we sure that client side does not have the same problem as the
server side?

I don't think we have any tests on the matter.

> Recursive dispatch sounds like an invitation to a memory leak (by
> unbounded recursion).

Yeah, in theory at least.

> Additionally, I'm not even sure how it'd work on the client side,
> since wl_display_dispatch_queue_pending first acquires the display
> lock: so doesn't a recursive dispatch deadlock?

I think the client side makes sure to hold no locks while calling the
handlers and re-takes them when the handler returns. Otherwise we might
have a deadlock on a handler sending a request.


Thanks,
pq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 811 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/wayland-devel/attachments/20160203/889d382b/attachment.sig>


More information about the wayland-devel mailing list