ignoring a process' last words: SIGPIPE ...

Thu Mar 27 08:51:03 PDT 2008

Hi Havoc,

On Thu, 2008-03-27 at 10:01 -0400, Havoc Pennington wrote:
> It's a known thing, see also https://bugs.freedesktop.org/show_bug.cgi?id=896

	Oh; grief :-) well, let's hope it will soon be an ex-known thing.

>      I'm not sure honestly how to fix this one. You have to fully dispatch the
> messages from a connection before you let that connection be shut down.

	Oh - so, it's not just a matter of reading them ? Ultimately - we
should only have one more read's worth of data left - and it should be a
full message and the:

  /* Disconnect in case of an error.  In case of hangup do not
   * disconnect the transport because data can still be in the buffer
   * and do_reading may need several iteration to read it all (because
   * of its max_bytes_read_per_iteration limit).  The condition where
   * flags == HANGUP (without READABLE) probably never happen in fact.
   */
  if ((flags & DBUS_WATCH_ERROR) ||
      ((flags & DBUS_WATCH_HANGUP) && !(flags & DBUS_WATCH_READABLE)))

	should handle this case right ?

	At least - there should be no problem leaving the HANGUP state
hanging(sic) around until we've finished reading right ? :-) when we
poll again we can get the state again - AFAICS we do that currently
anyway.
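
	To make that condition concrete, here is a minimal standalone sketch
of the same check. The WATCH_* constants and should_disconnect() are
hypothetical stand-ins made up for illustration (they are not the real
libdbus definitions); the point is only that HANGUP together with READABLE
does not disconnect, so the pending data can still be read first:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the DBUS_WATCH_* flags; the names and
 * values are assumptions for this sketch only. */
enum {
  WATCH_READABLE = 1 << 0,
  WATCH_WRITABLE = 1 << 1,
  WATCH_ERROR    = 1 << 2,
  WATCH_HANGUP   = 1 << 3
};

/* Returns true when the transport should be dropped right away:
 * on any error, or on a hangup with nothing left to read. */
bool
should_disconnect (unsigned int flags)
{
  if (flags & WATCH_ERROR)
    return true;
  if ((flags & WATCH_HANGUP) && !(flags & WATCH_READABLE))
    return true;
  return false;
}
```

	With this, a watch reporting HANGUP and READABLE together is left
alone until a later iteration reports HANGUP on its own.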

>      We could try just popping each message off the connection and
> dispatching it at the top of bus_connection_disconnected(). Seems
> like it could create some pretty scary reentrancy issues though.

	Yep sounds bad.

>      It'd be good to have test suite coverage for "connection
> disconnecting while all its messages have not been dispatched"

	That will require some delay-injection into the daemon, but shouldn't
be insurmountable - a patch to do that would be great.

> A workaround if dbus-send is calling a method would be to block for
> the method reply before exiting.

	I guess.

> I don't think just ignoring EPIPE would fix, since the bug is broader
> than that; it happens if the connection is lost for any reason before
> all messages are read from it.

	Surely we get EPIPE in all the interesting cases here.

> There is a related bug that we fail to know the credentials of the
> connection if it's already closed, so can't correctly process the
> messages even if the daemon were changed to dispatch them.

	Oh; that OTOH may be un-fixable; but AFAIR the 'Hello' is synchronous -
so, is that so bad ?

> Anyway, I think the daemon requires some hacking to basically keep
> credentials and remaining incoming messages around even though the
> connection is closed. I don't know if this will be easy or involve
> some challenges. I would start by adding a case to the daemon test
> suite (which is mostly stuffed in bus/dispatch.c) illustrating the
> problem.

	So I attach a patch that is a hack, but solves the problem for me - it
probably kills small, cute animals in the process (etc. etc.) YMMV.

	Here is how it works:

	a) we ignore SIGPIPE on writes: there may be un-read data in the
	   kernel socket buffer that we really want - even though the
	   remote 'read' channel is closed.
	b) if we have a WRITE and a READ poll pending, and we get a
	   hangup, we defer handling the hangup until the READ poll fires
		[ i.e. after we have read the pending data ]
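
	As a rough illustration of (a), and not the attached patch itself,
the sketch below uses a socketpair() in place of the real client/daemon
connection. read_last_words() is a hypothetical helper invented for this
example: the "client" end sends a message and closes straight away, the
"daemon" end then attempts a write (which fails with an error instead of
killing the process, because SIGPIPE is ignored) and afterwards still reads
the data sitting in the socket buffer:

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes of `msg` recovered
 * into `out` after the sender has gone away, or -1 on setup failure.
 * The errno of the failed write (0 if it succeeded) goes to *write_errno. */
ssize_t
read_last_words (const char *msg, char *out, size_t out_len, int *write_errno)
{
  int sv[2];
  size_t len;
  ssize_t n;

  assert (msg != NULL && out != NULL && write_errno != NULL);
  len = strlen (msg);

  /* Without this, the write below would raise SIGPIPE and the default
   * action would terminate the process. Note: this is process-wide. */
  signal (SIGPIPE, SIG_IGN);

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) < 0)
    return -1;

  /* "client" side: say its last words, then exit immediately. */
  if (write (sv[1], msg, len) != (ssize_t) len)
    {
      close (sv[0]);
      close (sv[1]);
      return -1;
    }
  close (sv[1]);

  /* "daemon" side: a write to the dead peer fails, but must not be
   * treated as a reason to throw away what is still unread. */
  *write_errno = 0;
  if (write (sv[0], "reply", 5) < 0)
    *write_errno = errno;

  /* The client's message is still in the kernel socket buffer. */
  n = read (sv[0], out, out_len);
  close (sv[0]);
  return n;
}
```

	The design point is just the ordering: the failed write is recorded
rather than acted on, and the read happens before any disconnect.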

	The fix seems at some level sane - though - really we don't want to
have 2 poll records when 1 is sufficient - and if we could re-factor the
READ and WRITE polls into a single poll record [ as in ORBit2 ] then we
wouldn't need 'b)' at all. Also, peripherally, we would avoid a vicious
'poll' bug in some 2.4.x kernels.
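
	A rough sketch of the single-record idea, assuming plain poll(2)
rather than the D-Bus watch machinery: drain_until_hangup() and its
parameters are hypothetical names for this example only. One pollfd carries
both the read and (optionally) the write interest, and a hangup is only
acted on once no readable data is reported any more:

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads whatever is pending on `fd` into `out`
 * using ONE poll record for both directions. Returns the number of
 * bytes read before EOF/hangup (or before `out` filled up), -1 on error. */
ssize_t
drain_until_hangup (int fd, int want_write, char *out, size_t out_len)
{
  struct pollfd pfd;
  size_t total = 0;

  assert (out != NULL);

  pfd.fd = fd;
  /* Read and write interest share a single record. */
  pfd.events = (short) (POLLIN | (want_write ? POLLOUT : 0));

  for (;;)
    {
      ssize_t n;

      pfd.revents = 0;
      if (poll (&pfd, 1, 0) < 0)
        return -1;

      /* Readable data always wins over a hangup reported alongside it. */
      if (pfd.revents & POLLIN)
        {
          if (total == out_len)
            break;                      /* caller's buffer is full */
          n = read (fd, out + total, out_len - total);
          if (n < 0)
            return -1;
          if (n == 0)
            break;                      /* EOF: everything was read */
          total += (size_t) n;
          continue;
        }

      /* Nothing (more) to read: a hangup or error, if any, can be
       * handled by the caller now. */
      break;
    }

  return (ssize_t) total;
}
```

	Because there is only one record, there is no separate WRITE poll
that can see the hangup before the READ side has drained the buffer.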

	How does that sound ?

	Thanks,

		Michael.

-- 
 michael.meeks at novell.com  <><, Pseudo Engineer, itinerant idiot

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dbus-sigpipe-fix.diff
Type: text/x-patch
Size: 2587 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/dbus/attachments/20080327/9c452cbe/attachment.bin