thread-safety of counters

Adrian Szyndela adrian.s at samsung.com
Mon Feb 23 07:38:44 PST 2015


W dniu 23.02.2015 o 13:56, Simon McVittie pisze:
> On 23/02/15 11:23, Adrian Szyndela wrote:
>> There are two messages involved, but both point to the same counter
>> (connection->transport->live_messages).
> I thought this counter was meant to be protected by one of the
> connection's mutexes (it has three - dispatch_mutex, io_path_mutex and
> mutex - because libdbus tries to be all things to all people and
> support an assortment of unlikely threading models). But I could be
> wrong...

Call stack for unref is:
_dbus_counter_unref
dbus_message_cache_or_finalize
dbus_message_unref
<client; fun() in sample program>

Call stack for ref is:
_dbus_counter_ref
_dbus_message_add_counter
_dbus_transport_queue_messages
< ... transport functions ... >
_dbus_transport_do_iteration
_dbus_connection_do_iteration_unlocked
_dbus_connection_block_pending_call
dbus_pending_call_block
dbus_connection_send_with_reply_and_block
<client; fun() in sample program>

I do not see a place where any of connection mutexes could be used,
especially for unrefs.

>> As a conclusion, I suggest changing DBusCounter's refcount from int to
>> DBusAtomic, as in DBusMessage.
> This seems reasonable, but it would be good to know whether the
> counter is meant to be protected by a mutex. If so, why was that mutex
> not sufficient here? and if not, how do we avoid getting an incorrect
> value in the counter via concurrent access from different threads?

The rer/unref functions on an ARM are compiled to:
Dump of assembler code for function _dbus_counter_ref:

         counter->refcount += 1;

<+0>:    ldr    r2, [r0, #0]         ; read

<+2>:    adds    r2, #1              ; modify

<+4>:    str    r2, [r0, #0]         ; store

...


Dump of assembler code for function _dbus_counter_unref:

         counter->refcount -= 1;

<+0>:    ldr    r3, [r0, #0]         ; read

<+2>:    subs    r3, #1              ; modify

<+4>:    str    r3, [r0, #0]         ; store

...


The bug appears when context is switched in "unrefing" thread after
reading, but before storing. If increment and decrement operations
become atomic, we should be safe.




More information about the dbus mailing list