sending error on corrupt data

Mon Nov 19 15:37:22 PST 2007

Hi,

I investigated a bit what was involved here; I thought it might be easy 
to just fix, but it wasn't _that_ easy, so I'll write down what would be 
involved in case someone has more time.

Problem Statement
===

Corruption (non-well-formedness) currently results in dropping the 
socket without sending any kind of error.

What would be better: attempt to send an error "blind" - i.e. without 
blocking (give up if buffer is full), and without doing any reading off 
the socket. Then disconnect as before.

In theory, virtually all disconnects here could be avoided through 
_dbus_return_if_fail() in libdbus. However, one or two of them perhaps 
can't be, and the libdbus checks don't help people reimplementing the 
protocol.

Code
===

The relevant code includes:

dbus-message.c:load_message() parses the message and sets the 
DBusMessageLoader::corrupted flag if the message is corrupted.

dbus-marshal-header.c:_dbus_header_load() parses the header 
specifically, and is most likely to encounter some kind of corruption 
problem. The message body can only be corrupt if someone's marshaling 
code is truly broken, while the header has other "higher level" 
well-formedness requirements like the presence of certain fields.

dbus-transport.c:_dbus_transport_queue_messages() has the logic that 
calls _dbus_transport_disconnect() if the loader is corrupted.

Problems
===

1) Sending an error requires a message serial to reply to. Otherwise, 
nobody will get the error. But the message was corrupted.
Possible solution: only send an error if the corruption is after we read 
the reply serial, in _dbus_header_load(). There are various possible 
corruptions before that point, but I don't know how to send errors for 
those.

2) Currently, all the code for writing out messages relies on pulling 
messages from the outgoing queue. It's an ugly layering violation to
call _dbus_connection_send_unlocked_no_update() from dbus-transport.c,
but I don't really have a better idea. It would be a mess to try to 
support writing out a message directly instead of putting it in the 
queue first.

3) Corruption errors are currently just an enum, so we'd need a strerror 
type thing for the enum.

Possible Alternate Solutions
===

- have some kind of new signal (or message type?) for "disconnect
   reason" so we don't need a reply serial - i.e. send a generic
   disconnect reason message instead of a reply to the bad message. This
   would solve problem 1)

- print a warning to stderr instead of trying to talk to the
   remote app (I don't like this idea much)

Havoc