Getting repeatable connection problem after using an event loop

Mon Oct 5 16:32:51 UTC 2020

On Sa, 03.10.20 08:28, Jimi Damon (jdamon at gmail.com) wrote:

> Hi,

(I figure if you are looking for support for sd-bus better use the
systemd ML rather than the dbus ML. The latter should probably only
used if you have questions about the reference implementation or the
spec, but not for other implementations of it)

>         }
>
>         return EXIT_SUCCESS;

EXIT_SUCCESS is a strange return code for a bus message match handler. It's
typically something you'd return from a main() function. This resolves
to zero though, which OTOH is actually an OK thing to return.

>     }
>
> /* Wait for signal */
> int32_t
> wait_for_signal(sd_bus* bus, char* match, uint64_t time_microseconds)
> {
>
>     int32_t r;
>     int32_t found          = 0;
>     waiting_control_t data = { .slot = NULL, .found = 0 };
>     sd_bus_error err       = SD_BUS_ERROR_NULL;
>
>     sd_bus_add_match(bus, &data.slot, match, tcb, &data);

note that while it might be ok to ignore errors for test code, in
production code you really shoudl check error codes here, as this call
might fail for example if the bus connection is has been terminated.

>     do
>     {
>         r = sd_bus_process(bus, NULL);
>         if (r < 0)
>         {
>             fprintf(stderr, "Failed to process bus: %s\n", strerror(-r));
>             break;

this error path is borked, since if this fails before the handler is
called at least once, then you'l leak the slot here, and it will have
a userdata ptr that points to this stack frame, which will then become
invalid memory.

>         }

As documented the usual pattern is to call ingo sd_bus_wait() here
only if the return value of sd_bus_process() equals zero (which means:
there was nothing to do, and we should wait for some IO event). If the
return value is > 0 then there was something to do, and you can
immediately call sd_bus_process() again.

That said, calling sd_bus_wait() in all cases is fine too, because it
actually becomes a NOP if there's still something to do.

>         r = sd_bus_wait(bus, 1000);

this can fail too...

>     } while (!data.found && (time_microseconds-=1000) > 0);
>
>     if (data.found)
>         return 0;
>     else
>         return -1;
> }
>
> /*Other API function that returns sees -22 on sd_bus_set_property */
> int logging_client_set_log_path(char *path)
> {
>     int32_t r;
>     logging_client_bus_struct_t b;
>     sd_bus_message* msg;
>     sd_bus_error error;
>     logging_client_bus_init(&b);
>     const char *s;
>     r = logging_client_bus_open(&b);
>     if (r < 0)
>     {
>         printf("Failed to open log_client bus\n");
>         return r;
>     }
>     r = sd_bus_set_property(b.bus,
>                             LOG_DESTINATION_NAME,
>                             LOG_OBJECT_PATH,
>                             LOG_INTERFACE_NAME,
>                             "logPath",
>                             &b.client_bus_error,
>                             "s",
>                             path
>                             );
>     return r;
> }
>
>
> The problem I have is that if I call another API function
> (logging_client_set_log_path) AFTER I have called this function, then the
> second API call (which sets a property)  ALWAYS fails with a -22 return
> code (coming from sd_bus_set_property). However, if I don't call this
> wait_for_signal API call, then my other API function seems to always
> succeed.

make sure you initialize all input parameters properly. Consider using
valgrind to see if something is accessed that is not properly
initialized.

sd-bus typically returns EINVAL when you pass invalid params in.

> Through all of this testing I can always run "busctl set-property ..."
> commands to set the logPath property . Hence I know that my server can
> correctly see these set-property commands and leads me to believe  that I'm
> not cleaning up structures in the function wait_for_signal correctly.
>
>
> In addition, there was something weird about my code. I had to add this
>
> if (ud) { .. }
>
> block because my callback function was getting called twice. I did not
> expect this since  I thought the first
>
> sd_bus_slot_unref(ud->slot);
>
> would have disabled the callback function.

This will dsiable it only if there's only a single ref left. mAybe you
have some ref somewhere else?

Lennart

--
Lennart Poettering, Berlin