qmi-proxy lockup, transaction timeouts
Eduard Strehlau
eduard at lionizers.com
Thu May 2 10:18:47 UTC 2024
Hi,
I have observed some lockups with qmi-proxy and I think I narrowed down the
issue.
Replicatable setup:
1. Start an instance of qmi-proxy.
2. Run multiple concurrent qmicli commands with -p for example:
"qmicli -p -d /dev/cdc-wdm0 --device-open-version-info --dms-noop &
qmicli -p -d /dev/cdc-wdm0 --device-open-version-info --dms-noop & qmicli
-p -d /dev/cdc-wdm0 --device-open-version-info --dms-noop"
The following or similar error should appear:
"error: couldn't create client for the 'dms' service: CID allocation
failed in the CTL client: Transaction timed out"
If the error does not appear kill the running qmi-proxy and repeat the
steps.
Probable cause:
There is a race condition that allows the same qmi-devices to be open
multiple times if proxy requests are made in quick succession.
This race condition is known and handled:
https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/blob/main/src/libqmi-glib/qmi-proxy.c#L371
Sadly this handling does not seem to be enough, I have observed qmi
responses to arrive on the transiently opened device.
This causes breakdown of the transaction handling (all requests get a
timeout) and eventual hangs of qmi-proxy
where it does not respond to exit signals anymore (I am not sure how, since
I am not familiar with the libqmi code).
I have made an strace of qmi-proxy and attached a full version for further
debugging, but the behaviour described above is visible:
Abbreviated trace (...) first number is line number second PID:
1297 5436 openat(AT_FDCWD, "/dev/cdc-wdm0",
O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 8
1433 5436 openat(AT_FDCWD, "/dev/cdc-wdm0",
O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 10
1577 5436 openat(AT_FDCWD, "/dev/cdc-wdm0",
O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 12
1676 5436 write(8, "\1\v\0\0\0\0\0\1!\0\0\0", 12) = 12
1697 5436 write(8, "\1\v\0\0\0\0\0\2!\0\0\0", 12) = 12
1760 5436 read(8, "\1\204\0\200\0\0\1\1!\0y" ... , 2048) = 133
1784 5436 read(12, "\1\204\0\200\0\0\1\2!\0y" ... , 2048) = 133
1830 5436 write(8, "\1\v\0\0\0\0\0\3!\0\0\0", 12) = 12
1850 5436 write(8, "\1\v\0\0\0\0\0\4!\0\0\0", 12) = 12
1880 5436 read(8, "\1\204\0\200\0\0\1\3!\0y" ... , 2048) = 133
1890 5436 write(1, ... "[30 Apr 2024, 12:05:52] [Debug] [/dev/cdc-wdm0]
No transaction matched in received message\n, 4096) = 4096
In my opinion this is a fairly severe bug since it defeats the entire
existence of qmi-proxy, it cannot be reliably used to concurrently use a
qmi device.
A proper fix would probably prevent the same device being open multiple
times in the first place.
I would love to open a pull request, but since I am not a glib developer
and it is not trivial to do I cannot justify it.
--
Eduard Strehlau
Lionizers GmbH
Bremer Straße 18
21073 Hamburg
Telefon:
+49 40 325084‑23
Webseite:
lionizers.com
LinkedIn:
lionizers <https://www.linkedin.com/company/lionizers>
🦁🧡 Wir bringen Digitalisierung und Nachhaltigkeit zusammen. #lionisieren
Sitz: Hamburg, Amtsgericht Hamburg HRB 143339
Geschäftsführer: Nils Löwe, Friederike Löwe
USt-IdNr. gemäß §27a Umsatzsteuergesetz: DE308701380 - Finanzamt Hamburg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/libqmi-devel/attachments/20240502/082a4489/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: proxy_trace_numbered
Type: application/octet-stream
Size: 220134 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/libqmi-devel/attachments/20240502/082a4489/attachment-0001.obj>
More information about the libqmi-devel
mailing list