<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi,</p>
<p>This could be similar to</p>
<p><a class="moz-txt-link-freetext" href="https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/issues/112">https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/issues/112</a></p>
<p>Best regards,</p>
<p>Martin</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 31.07.2024 at 18:01,
Aleksander Morgado wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAAFgFrWrRmg2htL+q2p0tcPEDobmozSQe66NSWrastQW1Eszpw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">Hey,</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Jul 31, 2024 at
5:14 PM Eduard Strehlau &lt;<a
href="mailto:eduard@lionizers.com" moz-do-not-send="true"
class="moz-txt-link-freetext">eduard@lionizers.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>Hi, <br>
I have observed some lockups with qmi-proxy and I think
I narrowed down the issue.<br>
<br>
Steps to reproduce:<br>
1. Start an instance of qmi-proxy.<br>
2. Run multiple concurrent qmicli commands with -p, for
example:<br>
"qmicli -p -d /dev/cdc-wdm0
--device-open-version-info --dms-noop & qmicli -p -d
/dev/cdc-wdm0 --device-open-version-info --dms-noop
& qmicli -p -d /dev/cdc-wdm0
--device-open-version-info --dms-noop"<br>
The following or similar error should appear:<br>
"error: couldn't create client for the 'dms' service:
CID allocation failed in the CTL client: Transaction
timed out"<br>
If the error does not appear, kill the running qmi-proxy
and repeat the steps.<br>
<br>
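<p>The steps above can be scripted so that the race is easier to hit repeatedly. The following is a minimal sketch, not a tested reproducer: it assumes qmi-proxy is already running, that qmicli is on the PATH, and that qmicli reports failures on stderr with an "error:" prefix as in the message quoted above. The helper names run_concurrently and count_failures are made up for this example.</p>
<pre>
```python
import subprocess

# Command line copied from the steps above; adjust the device path as needed.
QMICLI_CMD = ["qmicli", "-p", "-d", "/dev/cdc-wdm0",
              "--device-open-version-info", "--dms-noop"]


def run_concurrently(cmd, count=3, timeout=60):
    """Start `count` copies of `cmd` at once; return (returncode, stderr) pairs."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                         stderr=subprocess.PIPE, text=True)
        for _ in range(count)
    ]
    results = []
    for proc in procs:
        try:
            _, err = proc.communicate(timeout=timeout)
        except subprocess.TimeoutExpired:
            # A hung client is killed and then counted as a failure.
            proc.kill()
            _, err = proc.communicate()
        results.append((proc.returncode, err))
    return results


def count_failures(results):
    """Count runs that exited non-zero or printed an 'error:' line (assumed format)."""
    return sum(1 for code, err in results if code != 0 or "error:" in err)
```
</pre>
<p>Calling count_failures(run_concurrently(QMICLI_CMD)) after each fresh start of qmi-proxy gives the number of failed clients for that attempt.</p>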
Probable cause:<br>
There is a race condition that allows the same
QMI device to be opened multiple times if proxy requests
are made in quick succession.<br>
This race condition is known and handled:<br>
<a
href="https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/blob/main/src/libqmi-glib/qmi-proxy.c#L371"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/blob/main/src/libqmi-glib/qmi-proxy.c#L371</a><br>
<br>
Sadly, this handling does not seem to be enough: I have
observed QMI responses arriving on the transiently
opened device.<br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Oh, that is bad indeed, and needs to be handled.</div>
<div> </div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>This causes a breakdown of the transaction handling
(all requests time out) and eventually hangs
qmi-proxy,<br>
which then no longer responds to exit signals (I am
not sure how, since I am not familiar with the libqmi
code).<br>
<br>
I have made an strace of qmi-proxy and attached a full
version for further debugging, but the behaviour
described above is visible:<br>
Abbreviated trace (lines elided; the first number is the line
number in the full trace, the second is the PID):<br>
1297 5436 openat(AT_FDCWD, "/dev/cdc-wdm0",
O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 8<br>
1433 5436 openat(AT_FDCWD, "/dev/cdc-wdm0",
O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 10<br>
1577 5436 openat(AT_FDCWD, "/dev/cdc-wdm0",
O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK|O_LARGEFILE) = 12<br>
1676 5436 write(8, "\1\v\0\0\0\0\0\1!\0\0\0", 12) =
12<br>
1697 5436 write(8, "\1\v\0\0\0\0\0\2!\0\0\0", 12) =
12<br>
1760 5436 read(8, "\1\204\0\200\0\0\1\1!\0y" ... ,
2048) = 133<br>
1784 5436 read(12, "\1\204\0\200\0\0\1\2!\0y" ... ,
2048) = 133<br>
1830 5436 write(8, "\1\v\0\0\0\0\0\3!\0\0\0", 12) =
12<br>
1850 5436 write(8, "\1\v\0\0\0\0\0\4!\0\0\0", 12) =
12<br>
1880 5436 read(8, "\1\204\0\200\0\0\1\3!\0y" ... ,
2048) = 133<br>
1890 5436 write(1, ... "[30 Apr 2024, 12:05:52]
[Debug] [/dev/cdc-wdm0] No transaction matched in
received message\n, 4096) = 4096<br>
<br>
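<p>To make the misrouted response easier to see, the payloads in the abbreviated trace can be decoded. This is a sketch under assumptions: the header layout used here (QMUX marker, 16-bit little-endian length, flags, service, client, followed by the CTL flags, an 8-bit transaction ID and a 16-bit message ID) is my reading of the QMUX/CTL framing, the byte strings are transcribed by hand from the strace lines above, and the function names are made up for this example.</p>
<pre>
```python
CTL_FLAG_RESPONSE = 0x01  # assumed CTL header flag value for a response


def decode_ctl_frame(data):
    """Decode the first 10 bytes of a QMUX frame that carries a CTL message."""
    if 10 > len(data) or data[0] != 0x01:
        raise ValueError("not a QMUX frame or too short")
    return {
        "length": int.from_bytes(data[1:3], "little"),
        "qmux_flags": data[3],
        "service": data[4],        # 0 is the CTL service
        "client": data[5],
        "ctl_flags": data[6],
        "transaction": data[7],
        "message": int.from_bytes(data[8:10], "little"),
    }


# (fd, operation, payload) transcribed from the abbreviated strace above.
# Read payloads are truncated in the trace, so only the header is present.
TRACE = [
    (8, "write", b"\x01\x0b\x00\x00\x00\x00\x00\x01\x21\x00\x00\x00"),
    (8, "write", b"\x01\x0b\x00\x00\x00\x00\x00\x02\x21\x00\x00\x00"),
    (8, "read", b"\x01\x84\x00\x80\x00\x00\x01\x01\x21\x00\x79"),
    (12, "read", b"\x01\x84\x00\x80\x00\x00\x01\x02\x21\x00\x79"),
    (8, "write", b"\x01\x0b\x00\x00\x00\x00\x00\x03\x21\x00\x00\x00"),
    (8, "write", b"\x01\x0b\x00\x00\x00\x00\x00\x04\x21\x00\x00\x00"),
    (8, "read", b"\x01\x84\x00\x80\x00\x00\x01\x03\x21\x00\x79"),
]


def misrouted_responses(trace):
    """Return (transaction, request_fd, response_fd) where the two fds differ."""
    written_on = {}  # transaction id: fd the request was written to
    misrouted = []
    for fd, op, data in trace:
        frame = decode_ctl_frame(data)
        if op == "write":
            written_on[frame["transaction"]] = fd
        elif frame["ctl_flags"] == CTL_FLAG_RESPONSE:
            request_fd = written_on.get(frame["transaction"])
            if request_fd is not None and request_fd != fd:
                misrouted.append((frame["transaction"], request_fd, fd))
    return misrouted
```
</pre>
<p>For the excerpt above, misrouted_responses(TRACE) returns [(2, 8, 12)]: the request with transaction ID 2 was written to fd 8, but its response was read on fd 12 (trace line 1784).</p>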
In my opinion this is a fairly severe bug since it
defeats the entire purpose of qmi-proxy: the proxy cannot be
relied on for concurrent access to a QMI device.<br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>The issue happens exclusively when the proxy is trying to
open the control port for the first time, right? I don't
think we have observed this in the wild too much, e.g. when
using ModemManager, because in this case there is a single
process trying to open the proxy connection initially, not
more than one.</div>
<div><br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>A proper fix would probably prevent the same device
from being opened multiple times in the first place.<br>
I would love to open a pull request, but since I am not
a GLib developer and the fix is not trivial, I cannot
justify the effort.<br>
<br>
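<p>To illustrate the idea of never opening the same device twice, here is a small model of it. This is not libqmi code and not a proposed patch: libqmi is written in C on top of GLib, while this sketch uses Python's asyncio purely to show the pattern of registering an in-flight open before waiting on it. DeviceRegistry, fake_open and demo are hypothetical names.</p>
<pre>
```python
import asyncio


class DeviceRegistry:
    """Hypothetical registry that allows one open per device path."""

    def __init__(self, opener):
        self._opener = opener  # coroutine function taking a device path
        self._opens = {}       # path: task for a finished or in-flight open

    async def get(self, path):
        task = self._opens.get(path)
        if task is None:
            # Register the task before awaiting it, so that requests arriving
            # while the open is still in flight wait for this same open.
            task = asyncio.create_task(self._opener(path))
            self._opens[path] = task
        try:
            # shield() keeps one cancelled caller from cancelling the shared open.
            return await asyncio.shield(task)
        except Exception:
            # Forget a failed open so that a later request can retry.
            if self._opens.get(path) is task:
                del self._opens[path]
            raise


async def demo():
    opened = []

    async def fake_open(path):
        opened.append(path)
        await asyncio.sleep(0.01)  # stands in for the slow device open
        return object()

    registry = DeviceRegistry(fake_open)
    handles = await asyncio.gather(
        *(registry.get("/dev/cdc-wdm0") for _ in range(3))
    )
    # Number of real opens, and number of distinct handles handed out.
    return len(opened), len({id(h) for h in handles})
```
</pre>
<p>Running asyncio.run(demo()) returns (1, 1): three concurrent requests result in a single open and one shared handle, so no transient duplicate device exists to receive responses.</p>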
</div>
</div>
</blockquote>
<div><br>
</div>
<div>I understand.</div>
<div><br>
</div>
<div>I was going to ask you to open an issue in GitLab, but I see
you already did that at <a
href="https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/issues/113"
moz-do-not-send="true" class="moz-txt-link-freetext">https://gitlab.freedesktop.org/mobile-broadband/libqmi/-/issues/113</a>.
Thanks for that.</div>
<div><br>
</div>
<div>Let me see how we can solve this.</div>
<div>Thanks!</div>
</div>
<div><br>
</div>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>Aleksander</div>
</div>
</div>
</div>
</blockquote>
</body>
</html>