qmi-proxy lockup, transaction timeouts

Martin Maurer martin.maurer at mmeacs.de
Wed Jul 31 16:36:05 UTC 2024


Hi,

could be similar to

https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/issues/112

Best regards,

Martin


Am 31.07.2024 um 18:01 schrieb Aleksander Morgado:
> Hey,
>
> On Wed, Jul 31, 2024 at 5:14 PM Eduard Strehlau <eduard at lionizers.com> 
> wrote:
>
>     Hi,
>     I have observed some lockups with qmi-proxy and I think I narrowed
>     down the issue.
>
>     Replicatable setup:
>     1. Start an instance of qmi-proxy.
>     2. Run multiple concurrent qmicli commands with -p for example:
>        "qmicli -p -d /dev/cdc-wdm0 --device-open-version-info
>     --dms-noop & qmicli -p -d /dev/cdc-wdm0 --device-open-version-info
>     --dms-noop & qmicli -p -d /dev/cdc-wdm0 --device-open-version-info
>     --dms-noop"
>     The following or similar error should appear:
>        "error: couldn't create client for the 'dms' service: CID
>     allocation failed in the CTL client: Transaction timed out"
>     If the error does not appear kill the running qmi-proxy and repeat
>     the steps.
>
>     Probable cause:
>     There is a race condition that allows the same qmi-devices to be
>     open multiple times if proxy requests are made in quick succession.
>     This race condition is known and handled:
>     https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/blob/main/src/libqmi-glib/qmi-proxy.c#L371
>
>     Sadly this handling does not seem to be enough, I have observed
>     qmi responses to arrive on the transiently opened device.
>
>
> Oh, that is bad indeed, and needs to be handled.
>
>     This causes breakdown of the transaction handling (all requests
>     get a timeout) and eventual hangs of qmi-proxy
>     where it does not respond to exit signals anymore (I am not sure
>     how, since I am not familiar with the libqmi code).
>
>     I have made an strace of qmi-proxy and attached a full version for
>     further debugging, but the behaviour described above is visible:
>     Abbreviated trace (...) first number is line number second PID:
>       1297  5436  openat(AT_FDCWD, "/dev/cdc-wdm0",
>     O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 8
>       1433  5436  openat(AT_FDCWD, "/dev/cdc-wdm0",
>     O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 10
>       1577  5436  openat(AT_FDCWD, "/dev/cdc-wdm0",
>     O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 12
>       1676  5436  write(8, "\1\v\0\0\0\0\0\1!\0\0\0", 12) = 12
>       1697  5436  write(8, "\1\v\0\0\0\0\0\2!\0\0\0", 12) = 12
>       1760  5436  read(8, "\1\204\0\200\0\0\1\1!\0y" ... , 2048) = 133
>       1784  5436  read(12, "\1\204\0\200\0\0\1\2!\0y" ... , 2048) = 133
>       1830  5436  write(8, "\1\v\0\0\0\0\0\3!\0\0\0", 12) = 12
>       1850  5436  write(8, "\1\v\0\0\0\0\0\4!\0\0\0", 12) = 12
>       1880  5436  read(8,  "\1\204\0\200\0\0\1\3!\0y" ... , 2048) = 133
>       1890 5436  write(1, ... "[30 Apr 2024, 12:05:52] [Debug]
>     [/dev/cdc-wdm0] No transaction matched in received message\n,
>     4096) = 4096
>
>     In my opinion this is a fairly severe bug since it defeats the
>     entire existence of qmi-proxy, it cannot be reliably used to
>     concurrently use a qmi device.
>
>
> The issue happens exclusively when the proxy is trying to open the 
> control port for the first time, right? I don't think we have observed 
> this in the wild too much, e.g. when using ModemManager, because in 
> this case there is a single process trying to open the proxy 
> connection initially, not more than one.
>
>     A proper fix would probably prevent the same device being open
>     multiple times in the first place.
>     I would love to open a pull request, but since I am not a glib
>     developer and it is not trivial to do I cannot justify it.
>
>
> I understand.
>
> I was going to ask you open an issue in gitlab, but I see you already 
> did that at 
> https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/issues/113 
> Thanks for that.
>
> Let me see how we can solve this.
> Thanks!
>
> -- 
> Aleksander
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/libqmi-devel/attachments/20240731/786fe874/attachment-0001.htm>


More information about the libqmi-devel mailing list