Missing NULL checks - qmi-proxy.c

Aleksander Morgado aleksander at aleksander.es
Tue Dec 29 16:54:54 UTC 2020


Hey,

> > Is that backtrace obtained upon the crash? Looks like
> > qmi_client_info_array is not NULL in that backtrace
> >
>
> Sorry, obviously I can't "sit" on a crash, so that context is lost.  However,
> I'm chasing a 4th crash, which I've yet to catch in the act.   But I went
> back and took out the patches to recreate this.  I hope this is enough
> information this time:
>
>
>
> Thread 1 "qmi-proxy" received signal SIGSEGV, Segmentation fault.
> 0x77c93d41 in qmi_client_info_array_lookup_cid (array=0x0,
> service=QMI_SERVICE_LOC, cid=2 '<error reading variable>) at qmi-proxy.c:544
> 544     qmi-proxy.c: No such file or directory.
> (gdb) bt
> #0  0x77c93d41 in qmi_client_info_array_lookup_cid (array=0x0,
> service=QMI_SERVICE_LOC, cid=2 '<error reading variable>) at qmi-proxy.c:544
> #1  0x77c94581 in track_implicit_cid (self=0x77958a10, client=0x41c678,
> message=0x416ee0) at qmi-proxy.c:675
> #2  0x77c94e9b in process_message (self=0x77958a10, client=0x41c678,
> message=0x416ee0) at qmi-proxy.c:853
> #3  0x77c94fad in parse_request (self=0x77958a10, client=0x41c678) at
> qmi-proxy.c:905
> #4  0x77c9513f in connection_readable_cb (warning: GDB can't find the start of
> the function at 0x77b5717e.
>
> (gdb) return
> Make qmi_client_info_array_lookup_cid return now? (y or n) y
> #0  0x77c94581 in track_implicit_cid (self=0x77958a10, client=0x41c678,
> message=0x416ee0) at qmi-proxy.c:675
> 675     in qmi-proxy.c
> (gdb) print *self
> $1 = {parent = {g_type_instance = {g_class = 0x77a75a30}, ref_count = 2, qdata =
> 0x0}, priv = 0x77958a00}
> (gdb) print *client
> $2 = {ref_count = 2, proxy = 0x77958a10, connection = 0x77925d18,
> connection_readable_source = 0x77945e60, buffer = 0x77945b40, device = 0x418d00,
>    internal_proxy_open_request = 0x0, qmi_client_info_array = 0x77945b60,
> indication_id = 32, device_removed_id = 33}
>

No, I don't think that is printing the client struct that is
triggering the crash. The qmi_client_info_array is not NULL in that
one, which is fine.

>
> It's been suggested this is a transition case, where the CID might no longer be
> valid.  I'm not 100% on the code path that calls this, but it seems prudent
> that libqmi protects itself no matter what.
>

If we add those NULL checks and the problem is that we're reading the
struct once it's already been freed, we're really ignoring the
original problem and the invalid memory usage will bite us somewhere
else later on, maybe not as clear as in this case. I'm not against
adding NULL checks, but we shouldn't treat those as valid use cases if
the code path doesn't consider them valid. i.e. I would personally
make them g_assert()s if the logic assumes they should never happen,
not plain if() checks.

I'll try to review again the current logic to see how this may happen.

Do you have an idea of how this issue happens, like under which
circumstances? Is the crash happening when the modem resets itself, or
is the crash happening really randomly?

> In the meantime, ModemManager is also crashing, but there's some fixes in latest
> git that seem relevant.   I'm going to pull that, and then see if I can
> find the 4th crash in libqmi.

Which are those fixes in MM?

-- 
Aleksander
https://aleksander.es


More information about the libqmi-devel mailing list