E173s-1 stops working after weeks of uptime

Dan Williams dcbw at redhat.com
Fri May 31 02:46:01 UTC 2019


On Fri, 2019-05-31 at 00:01 +0200, Ladislav Michl wrote:
> On Thu, May 30, 2019 at 08:49:15PM +0200, Dario Nieuwenhuis wrote:
> > Hello,
> > 
> > We have some embedded devices deployed in the field using a Huawei
> > E173s-1 modem. We're having some issues where connectivity stops
> > working randomly, after 1-3 weeks of uptime.
> > 
> > First, there's a PPP disconnection. Then there are a few loops of
> > this error for 1-2 seconds, rather fast:
> > 
> >     failed to connect modem: Couldn't connect: cannot keep data port open.Could not open serial device ttyUSB0: reopen operation in progress
> > 
> > Afterwards, there are 10 reconnection attempts, with "at port timed
> > out X consecutive times" errors. After 10 attempts, ModemManager
> > gives up permanently and never tries again to bring the device back
> > up.
> > 
> >     (tty/ttyUSB0) at port timed out 10 consecutive times, marking modem '/org/freedesktop/ModemManager1/Modem/0' as invalid
> > 
> > A few hours later, when we could get onsite, restarting
> > ModemManager and NetworkManager brought the modem back online with
> > no issues (with no unplug/replug/powercycle of the modem or the
> > Linux board). This means the modem is not irreversibly
> > crashed/failed, so I think this is a software issue that should be
> > fixable.
> > 
> > Unfortunately, ModemManager was set to INFO log level because we
> > thought DEBUG was too verbose. We have now set DEBUG log level, so
> > the next time it happens we will have more logs.
> > 
> > I would really appreciate any input you may have on how to solve
> > this.
> > - I thought of patching out the "max 10 timeouts" limit, so
> >   ModemManager keeps retrying indefinitely. Is this a good idea?
> > - What can I try so that next time this happens we can get more
> >   info on the issue? (besides debug log level)
> > - Any recommendations in general on how to ensure the device is
> >   always connected? Any config knobs to tweak? We've been thinking
> >   of adding an "if no internet for 10 minutes, powercycle
> >   everything" watchdog, but that feels like giving up on getting
> >   this working properly.
> 
> I'm using an MM & NM restart and, if that doesn't help, a modem
> powercycle followed by a daemon restart. The boards have the modem
> power supply controlled by GPIO, so that's easy to do, but rather
> hackish. Is there any plan to add some "modem hw reset"
> infrastructure here?

We've thought about it before, but the problem is that it's pretty
unreliable for USB ports on most machines that aren't embedded.
Basically, you cannot guarantee that the USB host controller supports
the command to power cycle a port, and a lot of them just don't.

You can't expect the USB port reset to work, because that's a USB
request to the firmware running on the device IIRC and if the device is
already hung it's surely not going to pay attention to a reset request.
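
For reference, on Linux a reset can be requested from userspace through
usbfs. The following is only a minimal sketch, assuming a Linux host on
x86 or ARM (where the ioctl number below is encoded this way) and enough
privileges to open the device node. The function name usb_reset and the
example path are made up for illustration. Nothing here guarantees that
a hung device will react to the request.

```python
import fcntl
import os

# _IO('U', 20) from <linux/usbdevice_fs.h>. Assumption: this plain
# (type << 8) | nr encoding is what x86 and ARM use; other
# architectures may encode "no direction" ioctls differently.
USBDEVFS_RESET = (ord("U") << 8) | 20


def usb_reset(devnode: str) -> None:
    """Ask the kernel to reset the USB device behind devnode.

    devnode is a usbfs node such as /dev/bus/usb/001/004 (example
    path only; look up the real bus/device numbers with lsusb).
    Raises OSError if the node cannot be opened or the reset fails.
    """
    fd = os.open(devnode, os.O_WRONLY)
    try:
        fcntl.ioctl(fd, USBDEVFS_RESET, 0)
    finally:
        os.close(fd)
```

If the ioctl fails, or returns success but the modem still doesn't
answer AT commands, cutting power is the only remaining option.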

So basically yes, that infrastructure could be added, but there's no
way you can depend on the request actually succeeding, especially on
x86 machines.
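
In the meantime, the escalation Ladislav describes (restart the daemons
first, power cycle the modem only if that doesn't help) can live outside
ModemManager. Here is a rough sketch of that idea, not anything
ModemManager provides. The names recover and restart_daemons are
invented for this example, and it assumes a systemd-based system where
the units are called ModemManager and NetworkManager. The connectivity
check and the GPIO power cycle are board-specific, so they are passed in
as callables.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence, Tuple

# A recovery step is a (label, action) pair; actions take no arguments.
Step = Tuple[str, Callable[[], None]]


def restart_daemons() -> None:
    """Restart both daemons (assumes systemd and these unit names)."""
    subprocess.run(
        ["systemctl", "restart", "ModemManager", "NetworkManager"],
        check=True,
    )


def recover(
    is_online: Callable[[], bool],
    steps: Sequence[Step],
    settle_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Run recovery steps in order until is_online() reports success.

    Returns None if the link was already up, otherwise the label of
    the step after which connectivity came back. Raises RuntimeError
    if every step was tried and the link is still down.
    """
    if is_online():
        return None
    for label, action in steps:
        action()
        # Give the modem time to re-register before checking again.
        sleep(settle_seconds)
        if is_online():
            return label
    raise RuntimeError("all recovery steps failed")
```

A watchdog would call something like recover(check_link,
[("daemons", restart_daemons), ("power", power_cycle_modem)]) whenever
the link has been down for too long, where check_link and
power_cycle_modem are whatever your board needs (hypothetical names).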

Dan


