[systemd-devel] refcount of bus returned from sd_bus_default_system

Lennart Poettering lennart at poettering.net
Thu May 9 15:02:42 UTC 2019


On Mi, 08.05.19 22:50, Stanislav Angelovič (angelovic.s at gmail.com) wrote:

> Heya,
>
> when writing sdbus-c++, we've observed that sd_bus_default_system function
> called in a fresh new thread returns a bus with initial ref count 2. We
> built our code upon that assumption -- we had to unref the bus twice when
> the thread local storage got freed, otherwise we'd have gotten
> memory leaks.

Uh, no, this was never the right thing to do.

> Now it broke, however, because in systemd v242, the initial ref count is 1.
> Is that a conscious change? Can we build upon the guarantee that it will
> always be 1? (1 actually seems reasonable, unlike 2).

Originally, in sd-bus each bus message (i.e. each sd_bus_message
object) keeps a ref on the bus connection (i.e. each sd_bus object) it
belongs to. Moreover, if a message is queued into a bus connection it
is referenced by it. Thus you have a ref cycle, as long as a mesage is
queued into a bus and not processed yet. The intention then was that
whenever you want to get rid of a bus at the end of your
thread/process you'd flush out the bus anyway (because otherwise you
lose queued but unwritten messages), at which point the cycle would be
cleared up as soon as all messages are written.

Also note, that when a bus connection is set up a "Hello" message is
immediately enqueued into it (this is required by the spec). Thus, by
default there'd be two references onto the bus connection object: the
one you own yourself as the caller, and the one from the Hello bus
message queued onto it.

To flush out messages you should use the sd_bus_flush() call. it's
blocking, and all it does is force out all queued outgoing messages
that are not written yet, thus emptying the write queue. Then, call
sd_bus_close(), to explicitly terminate the connection. This will
empty the read queue too. At this point there are no messages queued
anymore, i.e. all refs on the bus object must be held by you and not
internally. Finally, get rid of the whole thing by doing
sd_bus_unref(). Since these three calls are often combined, there's
sd_bus_flush_close_unref() to combine them into one.

Calling sd_bus_flush() is entirely optional btw, it's sufficient to just
call sd_bus_close(). However in that case and pending unwritten
messages are just forgotten, and not written to the connection socket,
and never seen by the receiving side.

Now you might wonder, why doesn't sd_bus_unref() always implicitly
call sd_bus_flush()? Reffing/unreffing should not be blocking, that's
why. Moreover, the concept of "default" busses of a thread is that
various bits and pieces running in the thread could quickly ask for
the default bus connection, do some stuff with it, enqueue a message
or two, then unref it again. Other code might then pick it up again,
enqueue some more messages on them, unref it again, and so on. Then,
when the thread is about to end, there might be a number of messages
queued: before exiting from the thread they should be flushed out, but
only then, there's no need to do so earlier. Thus, in the earlier
uses of the bus connection your'd just unref the bus at the end, but
when you are finally done with the default connection altogether youd'
use the more powerful sd_bus_flush_close_unref() instead, and do all
the work synchronously, that the earlier users didn't bother to wait
for.

Note that when an sd_bus connection is attached to an sd_event event
loop, the sd_bus_flush() + sd_bus_close() is done automatically when
the "exit" phase of the event loop is entered.

Now, as it turns out, many users of sd-bus didn't get all of the above
right: they simply unreffed the bus after use, and enver bothered to
flush out any queued messages not written yet, and thus leaked
memory. With 1b3f9dd759ca0ea215e7b89f8ce66d1b724497b9 that was
addressed: now sd-bus tries to be a bit smarter than before: if it
notices, that a bus connection is only referenced by a message queued
into it but nothing else, and that there's no chance that it can be
referenced again, we'll detect this and release the entire object
automatically, knowing that it could never correctly be flushed out
anyway, since noone owns a ref to it anymore. Or in other words: the
ref cycles are now detected, and dealt with.

What does this mean for users? Well, if you are using sd-bus correctly
not much changes: make sure you flush/close your connections at the
end of each thread, to make sure you don't lose messages you enqueued
but that were never written out. If you don't do that correctly though
it's not a memory leak anymore as before if you forget to do that.

Lennart

--
Lennart Poettering, Berlin


More information about the systemd-devel mailing list