long (25 seconds) pause when starting hal-device-manager

Sat Jan 21 17:20:33 PST 2006

On Sat, 2006-01-21 at 19:51 -0500, David Zeuthen wrote:
> On Sat, 2006-01-21 at 15:34 -0800, Jeffrey W. Baker wrote:
> > Hi,
> > 
> > I have dbus 0.60, hal 0.5.6, and udev 079 on Linux 2.6.15.  When I start
> > hal-device-manager there is a 25-second pause before anything happens.
> > According to the strace, during the pause both hal-device-manager and
> > the system dbus daemon are in poll(), waiting for the other to say
> > something, basically.
> 
> I think this is a well-known problem with introspection and the python
> bindings. John?

Well known or not, it's kinda obnoxious :)

The most obvious place to start looking is the strace.  You can see that
the 25s pause is a single poll() call that waits only for input and runs
out its full 25000 ms timeout (return value 0):

4365  1137884518.206061 poll([{fd=6, events=POLLIN}], 1, 25000) = 0 <24.994242>

Problem is, we've entered poll() with POLLIN, but we actually have
something to write, and the dbus daemon is waiting for us to write it.
In fact the very next thing hal-device-manager does is a writev().

4365  1137884543.200881 writev(6, [{..., 152}, {"", 0}], 2) = 152 <0.000048>

We should not have entered poll() with POLLIN if we have something to
write which the peer is waiting to read.  That is the classic definition
of a deadlock.
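
For what it's worth, the stall is easy to reproduce outside of dbus.
This is only a minimal sketch in Python, not the real dbus code path: a
socketpair stands in for hal-device-manager and the daemon, the helper
name wait_ready is made up, and the timeout is shortened from 25000 ms
to 250 ms so it finishes quickly.

```python
import select
import socket


def wait_ready(sock, events, timeout_ms):
    """Poll one socket for `events`; return the revents mask, or 0 on timeout."""
    poller = select.poll()
    poller.register(sock, events)
    ready = poller.poll(timeout_ms)
    return ready[0][1] if ready else 0


# Hypothetical stand-ins: `client` plays hal-device-manager, `peer` the daemon.
client, peer = socket.socketpair()

# The peer is silent because it is waiting for our request, so polling
# for input alone can only end by timing out.
read_only = wait_ready(client, select.POLLIN, 250)

# Asking for writability as well returns at once: the socket buffer is
# empty, so the pending request could be written immediately.
read_write = wait_ready(client, select.POLLIN | select.POLLOUT, 250)

print("POLLIN only timed out:", read_only == 0)
print("POLLIN|POLLOUT says writable:", bool(read_write & select.POLLOUT))

client.close()
peer.close()
```

The first call sits out its whole timeout, just like the poll() in the
strace; the second comes back immediately with POLLOUT set.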

Now, how we got into that particular situation, I don't know.  Somehow
we get into _dbus_connection_do_iteration_unlocked() (from
_dbus_connection_block_pending_call(), from dbus_pending_call_block(),
from python) with flags = 6 when we want flags = 7.
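
To spell out what 6 versus 7 would mean, here is a small sketch that
decodes the two values.  The bit assignments are an assumption on my
part, taken from how I read the DBusIterationFlags enum in the dbus
source (DO_WRITING = 1, DO_READING = 2, BLOCK = 4), and the describe()
helper is made up for illustration.

```python
# Assumed bit values, mirroring my reading of DBusIterationFlags in dbus.
DO_WRITING = 1 << 0
DO_READING = 1 << 1
BLOCK = 1 << 2

FLAG_NAMES = [
    ("DO_WRITING", DO_WRITING),
    ("DO_READING", DO_READING),
    ("BLOCK", BLOCK),
]


def describe(flags):
    """Return the names of the iteration flags that are set in `flags`."""
    return [name for name, bit in FLAG_NAMES if flags & bit]


print(6, describe(6))  # the value we actually get
print(7, describe(7))  # the value we want
```

If those bit values are right, flags = 6 is "read and block" with no
writing, which matches a poll() that asks only for POLLIN.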

Either the inner loop tries to read when it should write, or the bug is
at a higher level with the same effect.  Beyond that I'm afraid I can't
be of much help, because this is my first look at the dbus code.

Should we x-post this to the dbus list?

-jwb