[RFC] TTY port locked problems when using PPP

Aleksander Morgado aleksander at aleksander.es
Tue Jun 18 12:43:10 UTC 2019


In the past months we've seen several reports of issues with the TTY
port "forced closed" and not being able to be reopened after that and
things like that. The root cause from this problem is that MM tries to
reuse the TTY port after MM itself has detected a disconnection (via
+CGEV URCs or +CGACT? polling). While trying to reuse the TTY, the
port is flagged as disconnected and reused, and so the TTY reading
method may end up getting a HUP because modem control lines are still
in effect (CLOCAL settings managed by pppd).

We have been discussing this on IRC and I think there are several ways
to try to solve this issue.

#1) The first way is an attempt to keep on using +CGEV and +CGACT?
polling and then do our best to reuse the port safely after we detect
a disconnection. This would mean that we try to reopen the port and
apply our settings even if pppd is still using the port. Also, we
could just ignore HUPs happening in the port in case we're using the
port with unexpected CLOCAL settings, and we would trigger a full port
reopen ourselves to set the settings we expect. The problem I see with
this solution is that we truly don't know what pppd is doing at the
same time; it may already be stopping after NM stop sees the bearer
object disconnected in MM, but we cannot be totally sure. Maybe the
procedure could be extended to check if other processes are using the
port before trying to reuse it ourselves (kind of lsof), but that
logic is not in place yet. The changes suggested for now are here:

#2) The second way is to skip using +CGEV URCs or +CGACT polling
completely if the data port is a TTY. This would recover the logic
that was in place before MM 1.8.0 (+CGACT polling was added in 1.8.0
and +CGEV URCs in 1.10.0). Without these two things, the bearer
disconnections would exclusively happen after NM (or whatever other
client) runs Disconnect() on the bearer object, and that would only
happen once the port is available for MM (i.e. pppd already stopped).
In this way, all MM/NM/pppd would be in sync again and MM would not
get any HUP in the TTY (as CLOCAL settings would be the expected
ones). But, we lose the ability to report the disconnection when the
context/bearer disconnection is detected with AT commands, and we
would rely only on pppd detecting the disconnection, as it was before
1.8.0. The changes suggested are here:

#3) A third way, for which there is no suggested MR yet would be to
keep +CGEV URCs and +CGACT polling, *but* instead of blindly reopening
the port to reuse it right away, only emit e.g. a DBus signal or
property update reporting that the bearer object "needs
disconnection", and so we would leave the task of running Disconnect()
to the upper layers (once the port is no longer used by pppd). MM
would not automatically try to reuse the port until that Disconnect()
is run by the client. This has the benefit that we keep all features
in place and we're sure we'll be in sync as well, but it would require
additional changes in NM. The new 'needs disconnection' signal could
be treated as completely optional, if other connection managers don't
want to use it, they can keep on relying on pppd-detected
disconnections exclusively without any additional change.

My opinion is a bit open on this, I liked the first way because we
would try to be as robust as possible and totally avoid using the port
with wrong CLOCAL settings, but I don't think we can have that
robustness unless we're truly in sync with NM and pppd, and for that
we need either #2 or #3. And if we target to be in sync, then I don't
think #1 is needed, we should always try to enforce being in sync
instead. I'd vote to have #2 in MM 1.8 and MM 1.10 and #3 for the next
MM 1.12?

Comments welcome!


More information about the ModemManager-devel mailing list