dbus_connection_send_with_reply() shouldn't return TRUE on a disconnected connection

Fri Apr 17 17:03:16 PDT 2009

Hi,

On Fri, Apr 17, 2009 at 7:04 PM, Lennart Poettering <mzqohf at 0pointer.de> wrote:
> Be that as it will. Right now however the caller cannot handle non-OOM
> errors at all.
...
> I can't say I particularly like it that DBusConnection might just eat
> my messages without saying anything.

This isn't accurate. If the message can't be sent for a non-OOM reason
(a disconnected connection, for example), you'll get a NULL
DBusPendingCall.
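
To make that concrete, here is a minimal sketch of the three-way check a
caller ends up doing. It does not call libdbus; classify_send() and the
send_outcome enum are hypothetical names made up for illustration. `ok`
stands for the boolean returned by dbus_connection_send_with_reply() and
`pending` for the pointer stored through its pending_return argument.

```c
#include <stddef.h>

/* Three outcomes a caller has to tell apart after a send-with-reply call. */
typedef enum {
    SEND_OOM,        /* call returned FALSE: out of memory, retry later */
    SEND_NOT_QUEUED, /* call returned TRUE but the pending call is NULL */
    SEND_QUEUED      /* call returned TRUE and the pending call is set */
} send_outcome;

/* Hypothetical helper, not part of libdbus.
 * ok:      the boolean return value of the send call
 * pending: the value written to the pending_return out-parameter */
static send_outcome classify_send(int ok, const void *pending)
{
    if (!ok)
        return SEND_OOM;
    if (pending == NULL)
        return SEND_NOT_QUEUED;
    return SEND_QUEUED;
}
```

With the real API you would pass the return value and the DBusPendingCall
pointer straight into a check like this; only SEND_OOM is worth retrying
unchanged.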

> The reason I was looking at this is that I am hacking unix fd passing
> support into dbus. Now, I'd like to make
> dbus_connection_send_with_reply() and friends fail if you try to send
> a DBusMessage that includes unix fds on a connection that doesn't
> support sending fds. Question is, in the absence of a separate DBusError
> parameter, what's the best way to handle this? Return TRUE? Return
> FALSE?

You could do a NULL DBusPendingCall, but I think this may even be a
return_if_fail case, if your API contract requires first checking for
fd-passing support before sending the message. i.e. if the caller is
required to do:

    if (connection_supports_fd(connection)) {
        send_message_with_fd();
    }

Then you can just return_if_fail inside send_message_with_fd(). If the
caller isn't required to do that check first, then you can't consider
this a programming error and need to signal a runtime error.
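
A rough sketch of the two contracts side by side, in case it helps. All
of this is hypothetical: conn_t, RETURN_VAL_IF_FAIL and both send
functions are stand-ins I made up for illustration, not libdbus API. The
macro is only in the spirit of g_return_val_if_fail().

```c
#include <stdio.h>

/* Hypothetical stand-in for a connection; not libdbus's DBusConnection. */
typedef struct {
    int supports_fd; /* nonzero if fd passing is available */
    int sent;        /* number of messages queued so far */
} conn_t;

/* Assumed precondition macro: complain loudly and bail out. */
#define RETURN_VAL_IF_FAIL(cond, val)                            \
    do {                                                         \
        if (!(cond)) {                                           \
            fprintf(stderr, "precondition failed: %s\n", #cond); \
            return (val);                                        \
        }                                                        \
    } while (0)

/* Contract A: the caller MUST check supports_fd first.
 * Violating that is a bug in the caller, so it is only asserted. */
static int send_with_fd_checked(conn_t *c)
{
    RETURN_VAL_IF_FAIL(c != NULL, 0);
    RETURN_VAL_IF_FAIL(c->supports_fd, 0);
    c->sent++;
    return 1;
}

/* Contract B: no prior check is required, so missing fd support
 * is a runtime error that the caller is expected to handle. */
typedef enum { SEND_OK, SEND_ERR_NO_FD_SUPPORT } send_status;

static send_status send_with_fd_runtime(conn_t *c)
{
    if (!c->supports_fd)
        return SEND_ERR_NO_FD_SUPPORT;
    c->sent++;
    return SEND_OK;
}
```

Under contract A the failure path exists only to flag a caller bug; under
contract B it is part of the documented behavior.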

A runtime-error alternative to a NULL pending call would be to just
send the message and have the peer return an error. A peer that
doesn't understand the file descriptor typecode will drop the
connection. I'm not sure what you patched the spec to say for the case
where an fd-typecode-understanding peer is on a machine that does not
support fd-passing, but one possibility is that the peer returns an
error reply. The error reply could be the same one that would happen if
the syscalls to collect the fd from the socket failed.

To decide whether this should be considered a programming bug or a
runtime error, the basic test is whether on getting this error, you
would fix the app to never cause it (that means it's a programming
error) or fix the app to handle it (that means it's a runtime error).
You would not fix an app to handle SEGV for example, because it's a
programming error. See also
http://library.gnome.org/devel/glib/unstable/glib-Error-Reporting.html
for discussion of this.

> I'd have voted for FALSE. Now, you seem to suggest FALSE is for OOM
> only and that's what matters.

Handling OOM requires the caller to know that OOM occurred, because
usually you have to roll back a "transaction" and retry. For most
other errors, the retry is either done differently or not sensible
at all.

The need to distinguish OOM is not some irrational, arbitrary thing. I
assure you that OOM must be separately detectable to write an
OOM-handling application, and that such an application would break if
you overloaded this return value. Basically the app would keep waiting
for memory pressure to be relieved and retrying to send the message,
but the send would always fail, even though memory had been freed up.
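
Here is a toy model of that failure mode. Everything in it is
hypothetical (the result codes, the sender struct, the function names);
libdbus only gives you TRUE/FALSE here. The retry loop is bounded so the
sketch terminates, where a real OOM-handling app might loop indefinitely.

```c
/* Hypothetical result codes, for illustration only. */
typedef enum { R_OK, R_OOM, R_UNSUPPORTED } result;

/* Hypothetical sender: fails with OOM `oom_left` times, or fails
 * permanently if `unsupported` is set. */
typedef struct {
    int oom_left;
    int unsupported;
} sender;

static result try_send(sender *s)
{
    if (s->unsupported)
        return R_UNSUPPORTED;
    if (s->oom_left > 0) {
        s->oom_left--; /* simulates memory being freed over time */
        return R_OOM;
    }
    return R_OK;
}

/* Overloaded API: every kind of failure collapses into FALSE (0). */
static int try_send_overloaded(sender *s)
{
    return try_send(s) == R_OK;
}

/* OOM-handling caller written against the overloaded API: it treats
 * any FALSE as OOM and retries. Returns the number of attempts used,
 * or -1 if it gave up after max_attempts. */
static int retry_until_sent(sender *s, int max_attempts)
{
    int attempt;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_send_overloaded(s))
            return attempt;
        /* a real app would wait for memory pressure to ease here */
    }
    return -1;
}
```

A sender that is merely short on memory eventually gets through, while
one hitting the non-OOM failure is retried until the caller gives up,
because the caller has no way to tell the two apart.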

Havoc