OpenWrt: unexpected disconnect requiring manual restart of ModemManager
Aleksander Morgado
aleksander at aleksander.es
Sun Apr 26 12:48:21 UTC 2020
Hey!
> > Anyway, as you also said, having lost the AT port may indicate other
> > kinds of internal problems, so attempting to trigger a reset ourselves
> > wouldn't be that bad in this case? I'm open to suggestions on how to
> > best handle this. Until now ModemManager hasn't done too many control
> > things "by itself", but I understand this may be one of those things
> > that could be automatically handled. It wouldn't be a bad idea to
> > start using a configuration file that would allow disabling these
> > smarts if needed.
>
> For the record: I believe the current design where ModemManager takes
> few actions by itself, and just provides the framework for e.g.
> NetworkManager, is a very good one.
>
Yes, yes. That is also why we never needed a config file anyway,
because MM would never apply any logic by itself. The only exception
right now is the automatic carrier config selection in the DW5821e;
and I think we could still have more exceptions if the purpose of the
self-applied logic is keeping the modem alive and workable. This new
issue you report would fit that target I think.
> So, yes, I have some problems putting my finger exactly on what I
> believe is wrong here. Sorry about all the self contradictions while I
> try to figure out where I really want to go ;-)
>
> Maybe the "declare modem as invalid" is an action which should be
> delegated to some upper layer, if we are sticking to that design? You
> don't leave much choice to the ModemManager "user", as in the
> application controlling MM, when you just delete the modem like that.
> MM knows the modem wasn't disconnected from the USB bus, so it cannot
> expect any hotplug events in this situation. How should the controlling
> application realise that it is supposed to do something? And what can it
> do? Restart MM? On what trigger?
>
MM must remove the modem from DBus ONLY if the modem is no longer
usable. That's it. When this happens, a restart of ModemManager
shouldn't recover the modem, because if that was the case, we should
have tried to do it ourselves without bothering anyone. So, under
these assumptions, I don't think it's reasonable to leave upper layers
with a modem still exposed in e.g. "failed" state, because the modem
is supposed to be "broken" and not recoverable unless it is completely
power cycled.
I'll give you two examples of why this logic is in place:

1) There are some USB modems (AT+NET) where the firmware inside the
device gets stuck and the AT ports end up no longer responding (and
the NET wwan port giving out errors like "transmit queue 0 timed
out"). If those devices reach that situation, that's it, they're dead
until manually unplugged/replugged (or powered off/on if they have
external power control).

2) An RS232 modem connected to the host PC via a USB<->RS232 adapter.
If the user disconnects the RS232 modem without disconnecting the USB
adapter, we need to detect that the modem is no longer usable.

In both those cases, MM detects the silent AT ports, and ends up
removing the device from DBus once it assumes it is no longer usable.
Your use case is none of those, because in your use case you lost the
AT port while presumably the QMI/MBIM port was still usable. You lost
some features only, not the full modem control, so we shouldn't have
flagged the modem as removed; that is definitely something to fix.
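
A rough sketch of that distinction, in Python for brevity. This is not
actual ModemManager code: PortStatus and modem_still_usable are
hypothetical names, and the rule "any responsive control port keeps the
modem exposed" is only an assumption drawn from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class PortStatus:
    """Hypothetical snapshot of one control port (not a ModemManager type)."""
    kind: str         # e.g. "at", "qmi", "mbim"
    responsive: bool  # True if the port still answers commands


def modem_still_usable(ports):
    """Return True if at least one control port still responds."""
    return any(p.responsive for p in ports)


# The stuck-firmware example: every port is silent, so the modem is gone.
stuck = [PortStatus("at", False), PortStatus("at", False)]
# The reported case: AT is silent but QMI still answers, so the modem stays.
degraded = [PortStatus("at", False), PortStatus("qmi", True)]

print(modem_still_usable(stuck))     # False
print(modem_still_usable(degraded))  # True
```
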
> I guess this leads to either
> 1) don't delete devices unless told to by e.g OS or managing
> application, or
Don't think we should do that, based on what I tried to explain above.
MM should only flag as removed from DBus those devices that are no
longer usable.
> 2) delete *and* take the responsibility for rediscovery of
> the deleted device
>
I believe we already do that, don't we? When flagged as failed and
removed from DBus, I think the device ports go through a full reprobe
phase, or at least they should. I recall doing that years ago to handle
some RS232 over USB adapter use cases. If this is not happening right now,
we should definitely re-add it. Do you have full logs when the issue
happened to you? We can probably check whether that's the case or not
looking at the logs.
Anyway, I think there is still another option 3 to consider: whenever
possible, we shouldn't even take the step to cleanly delete and shut
down the modem. If we detect, as in your case, a silent AT port but we
still have means to reset the device, we should reset it right away
(not just reprobe the ports), without thinking about it much. The
target is to have the modem as usable as possible with all
the features it had when we first detected it (so assuming that every
reset will give us the same set of features).
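
Put as a decision function, the ordering could look roughly like the
sketch below. It is only an illustration under assumptions: the Action
names, the choose_recovery_action helper and its three boolean inputs
are hypothetical and do not exist in ModemManager.

```python
from enum import Enum


class Action(Enum):
    NONE = "nothing to do"
    RESET = "reset the device right away"
    KEEP_DEGRADED = "keep the modem exposed with fewer features"
    REMOVE_AND_REPROBE = "remove from DBus and reprobe all ports"


def choose_recovery_action(at_silent, can_reset, other_port_usable):
    """Pick a recovery step for a modem whose AT port may have gone silent."""
    if not at_silent:
        return Action.NONE
    if can_reset:
        # Option 3: prefer a full reset so all original features come back.
        return Action.RESET
    if other_port_usable:
        # Only some features were lost; the modem is still controllable.
        return Action.KEEP_DEGRADED
    # Nothing answers and we cannot reset: the modem is no longer usable.
    return Action.REMOVE_AND_REPROBE


print(choose_recovery_action(True, True, True).name)    # RESET
print(choose_recovery_action(True, False, False).name)  # REMOVE_AND_REPROBE
```
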
> I am fine with deletion on fatal error if we can trust there will be an
> automatic rediscovery attempt. This might obviously fail in cases where
> a modem really did silently die without disconnecting from the bus. But
> we should give rediscovery at least one chance.
>
As said, the rediscovery (re-probing of all ports) without modem reset
should already be there in place. I think I should recheck that logic
to make sure it's happening.
> I don't have a strong opinion whether this rediscovery action should be
> handled by MM directly, or by the application using MM. But I do
> believe both delete and rediscovery must be initiated by same entity,
> since both actions are parts of the same decision.
I don't think the upper layers of ModemManager should be in charge of
anything; unless the upper layers are able to physically control the
power to the device. In most user setups, this is not the case, so the
current behavior of only removing modems from DBus when MM assumes
they're no longer usable should be good. In custom professional setups
where the host has access to the power of the modem through e.g. some
GPIO, the control of when to fully reset the device should be shared:
leave to MM as much control as possible to handle the device, but if
MM ends up removing the device from DBus, upper layers could detect it
and trigger the GPIO reset to bring it back again (always following
the manufacturers' advice on how to do that!)
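
For such a custom setup, the upper-layer side could be sketched roughly
as below. The handler is meant to be fed from the standard
org.freedesktop.DBus.ObjectManager InterfacesRemoved signal that MM
emits when a modem object goes away; the D-Bus wiring itself is left
out, and on_interfaces_removed and the power_cycle callback are
hypothetical names (what power_cycle does with the GPIO is entirely up
to the integrator and the manufacturer's instructions).

```python
MODEM_PATH_PREFIX = "/org/freedesktop/ModemManager1/Modem/"
MODEM_INTERFACE = "org.freedesktop.ModemManager1.Modem"


def on_interfaces_removed(object_path, interfaces, power_cycle):
    """Call power_cycle() if the signal says a modem object was removed.

    Returns True when a power cycle was triggered, False otherwise.
    """
    if not object_path.startswith(MODEM_PATH_PREFIX):
        return False  # Some other object (e.g. a bearer or a SIM).
    if MODEM_INTERFACE not in interfaces:
        return False  # The modem object is still exported.
    power_cycle()     # Hypothetical: toggle the GPIO per vendor timing.
    return True


# Usage with a stand-in for the real GPIO reset:
calls = []
on_interfaces_removed(
    "/org/freedesktop/ModemManager1/Modem/0",
    [MODEM_INTERFACE],
    lambda: calls.append("power cycled"),
)
print(calls)  # ['power cycled']
```
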
--
Aleksander
https://aleksander.es