bug on dbus head

David Zeuthen david@fubar.dk
Wed, 15 Oct 2003 20:42:13 +0200


On Wed, 2003-10-15 at 20:16, Havoc Pennington wrote:
> On Wed, 2003-10-15 at 11:52, David Zeuthen wrote:
> > with current d-bus head an application can make the system message bus,
> > dbus-daemon-1, occupy 100% CPU and not respond.
> 
> Can you get a backtrace of where it's stuck? Or track down further the
> exact ChangeLog that broke it? (clearly mine of course)
> 

I hit Ctrl+C when it's spinning. I have commented out fork and
run-as-user in the config file.

[root@laptop bus]#
[root@laptop bus]# rm -f /usr/local/var/run/messagebus.pid
[root@laptop bus]# gdb ./dbus-daemon-1
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) run --system
Starting program: /home/david/xdg-hacking/dbus/bus/dbus-daemon-1 --system
 
Program received signal SIGINT, Interrupt.
0x42074bcd in malloc_consolidate () from /lib/tls/libc.so.6
(gdb) backtrace
#0  0x42074bcd in malloc_consolidate () from /lib/tls/libc.so.6
#1  0x42074249 in _int_malloc () from /lib/tls/libc.so.6
#2  0x42074e31 in _int_realloc () from /lib/tls/libc.so.6
#3  0x420738c4 in realloc () from /lib/tls/libc.so.6
#4  0x0807c3ba in dbus_realloc (memory=0x42132320, bytes=88)
    at dbus-memory.c:589
#5  0x080813cd in reallocate_for_length (real=0x80b4124, new_length=2048)
    at dbus-string.c:393
#6  0x08081442 in set_length (real=0x80b4124, new_length=2048)
    at dbus-string.c:419
#7  0x08089d83 in _dbus_read (fd=6, buffer=0x80b4124, count=2048)
    at dbus-sysdeps.c:194
#8  0x08072a0c in do_reading (transport=0x80b40a0)
    at dbus-transport-unix.c:658
#9  0x08072b83 in unix_handle_watch (transport=0x80b40a0, watch=0x80b46c0,
    flags=1) at dbus-transport-unix.c:739
#10 0x080716aa in _dbus_transport_handle_watch (transport=0x80b40a0,
    watch=0x80b46c0, condition=1) at dbus-transport.c:581
#11 0x0805d767 in _dbus_connection_handle_watch (watch=0x80b46c0, condition=1,
    data=0x80b4260) at dbus-connection.c:1055
#12 0x08073b39 in dbus_watch_handle (watch=0x80b46c0, flags=1)
    at dbus-watch.c:592
#13 0x0808ee30 in _dbus_loop_iterate (loop=0x80b44d8, block=1)
    at dbus-mainloop.c:783
#14 0x0808f10b in _dbus_loop_run (loop=0x80b44d8) at dbus-mainloop.c:847
#15 0x0805b63d in main (argc=2, argv=0xbfffeb64) at main.c:304
#16 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6

Hmm.. looks like memory corruption messing up glibc's malloc (my system
is stock RHL9 + some updates).

Just before spinning, valgrind says "Invalid free()"..

[root@laptop bus]# rm -f /usr/local/var/run/messagebus.pid
[root@laptop bus]# valgrind ./dbus-daemon-1 --system
==28527== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==28527== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==28527== Using valgrind-20030725, a program supervision framework for x86-linux.
==28527== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==28527== Estimated CPU clock rate is 651 MHz
==28527== For more details, rerun with: -v
==28527==
==28527== Invalid free() / delete / delete[]
==28527==    at 0x400296BF: free (vg_replace_malloc.c:220)
==28527==    by 0x807C448: dbus_free (dbus-memory.c:632)
==28527==    by 0x805134A: check_pending_reply_data_free (connection.c:1607)
==28527==    by 0x8051537: cancel_hook_free (connection.c:1745)
==28527==    Address 0x412CD114 is 0 bytes inside a block of size 12 free'd
==28527==    at 0x400296BF: free (vg_replace_malloc.c:220)
==28527==    by 0x807C448: dbus_free (dbus-memory.c:632)
==28527==    by 0x8051AF6: connection_execute_transaction (connection.c:1986)
==28527==    by 0x8051B2C: bus_transaction_execute_and_free (connection.c:2004)
==28527==
==28527== Jump to the invalid address stated on the next line
==28527==    at 0xABCDEF: ???
==28527==    by 0x80779A9: _dbus_list_foreach (dbus-list.c:797)
==28527==    by 0x8051559: free_cancel_hooks (connection.c:1753)
==28527==    by 0x8051B53: bus_transaction_execute_and_free (connection.c:2008)
==28527==    Address 0xABCDEF is not stack'd, malloc'd or free'd
Segmentation fault
[root@laptop bus]# 

And then valgrind crashes! Hope this helps.
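
The first valgrind error reads like a double free: the 12-byte block is
released once under connection_execute_transaction and then again from
cancel_hook_free. Below is a minimal sketch of that kind of pattern and one
way to guard against it. Every name in it (CancelHook, hook_commit,
hook_free, run_transaction) is hypothetical and it is not the actual code in
connection.c; it only assumes that the commit path and the hook teardown
both believe they own the same payload.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT the D-BUS source. */
typedef void (*FreeFunc)(void *data);

typedef struct {
    void *data;          /* payload the hook would clean up on cancel */
    FreeFunc free_data;  /* NULL once the payload has been released */
} CancelHook;

static int free_count = 0;  /* how many times the payload was released */

static void counting_free(void *data)
{
    free_count++;
    free(data);
}

/* Commit path: releases the payload, then makes the hook forget it. */
static void hook_commit(CancelHook *hook)
{
    if (hook->free_data != NULL)
        hook->free_data(hook->data);
    hook->data = NULL;
    hook->free_data = NULL;  /* without this, hook_free() frees it again */
}

/* Teardown path: runs for every hook, whether committed or cancelled. */
static void hook_free(CancelHook *hook)
{
    if (hook->free_data != NULL)
        hook->free_data(hook->data);
    free(hook);
}

/* Runs commit followed by teardown; returns the number of payload frees. */
static int run_transaction(void)
{
    CancelHook *hook = malloc(sizeof *hook);
    if (hook == NULL)
        abort();
    hook->data = malloc(12);  /* same size as the block in the report */
    if (hook->data == NULL)
        abort();
    hook->free_data = counting_free;

    free_count = 0;
    hook_commit(hook);
    assert(hook->free_data == NULL);  /* ownership was dropped on commit */
    hook_free(hook);
    return free_count;
}
```

With the two assignments to NULL in hook_commit() the payload is freed
exactly once. Dropping them would reproduce the "Invalid free()" shape
above, where teardown releases a block the commit path already released.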

Thanks,
David