[PATCH] port-serial: don't close tty file descriptor twice
Aleksander Morgado
aleksander at aleksander.es
Wed Jun 20 12:56:12 UTC 2018
Hey,
On Wed, Jun 20, 2018 at 12:25 AM, Brian, Sam <Samuel.Brian at digi.com> wrote:
> After the tty is opened, the fd is passed off to a g_io_channel which
> manages it. When the serial port is closed with _close_internal(),
> g_io_channel_shutdown() is called which then calls g_io_unix_close() which
> in turn close()s the fd.
> Currently, _close_internal() performs a second explicit close() after the
> g_io_channel shutdown which *in the best case scenario* will fail with EBADF.
> This is a race condition that can result in other file descriptors being
> closed.
>
> This change avoids double closing the serial port fd.
> ---
>
> Hey,
> I had this problem with a Telit LE910-EU1. What do you think of the fix?
>
> To give some more detail about the problem...
>
> The g_io_channel_shutdown() calling close() can be verified with
> this short program:
>
> $ cat test.c
> #include <glib.h>
> #include <errno.h>
> #include <stdio.h>
> #include <unistd.h>
>
> int main (int argc, char* argv[])
> {
> int ret;
> GIOChannel *channel;
>
> channel = g_io_channel_unix_new(STDIN_FILENO);
> g_io_channel_shutdown(channel, FALSE, NULL);
>
> ret = close(STDIN_FILENO);
> printf("close() -> %d: %m\n", ret);
> return 0;
> }
>
> $ gcc -o test test.c -g $(pkg-config --cflags --libs gio-2.0)
> $ ./test
> close() -> -1: Bad file descriptor
>
> Here is a snippet of strace from ModemManager that demonstrates when the
> race causes another fd to be closed:
>
> ### Close the tty (from g_io_channel_shutdown)
> [pid 413] close(9 <unfinished ...>
> [pid 687] <... poll resumed> ) = 1 ([{fd=7, revents=POLLIN}])
> ...
> ### tty's closed fd is reused by eventfd2
> [pid 687] SYS_356(0, 0x80800, 0xc6c4c8, 0, 0xc6c4d0) = 9
> ...
> [pid 687] poll([{fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN}], 3, -1 <unfinished ...>
> ### First close succeeds (tty fd)
> [pid 413] <... close resumed> ) = 0
> ### Second (explicit) close succeeds (eventfd fd, oops)
> [pid 413] close(9) = 0
>
> PID 413 is ModemManager, PID 687 is a clone process created with the flag
> CLONE_FILES so the fd table is shared between parent and clone.
> From the clone's read/writes, it appears to be used to communicate with
> dbus.
>
> Strangely, closing the eventfd didn't always cause a noticable failure.
> It was only when *somehow* the port-serial code tried to use an eventfd as
> a tty and AT commands fail.
>
> src/mm-port-serial.c | 24 +++++++++++++-----------
> 1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/src/mm-port-serial.c b/src/mm-port-serial.c
> index 44699a76..9ba4e5d2 100644
> --- a/src/mm-port-serial.c
> +++ b/src/mm-port-serial.c
> @@ -1262,6 +1262,12 @@ success:
> error:
> mm_warn ("(%s) failed to open serial device", device);
>
> + if (self->priv->fd >= 0) {
> + if (!self->priv->iochannel)
> + close (self->priv->fd);
> + self->priv->fd = -1;
> + }
> +
> if (self->priv->iochannel) {
> g_io_channel_shutdown (self->priv->iochannel, FALSE, NULL);
> g_io_channel_unref (self->priv->iochannel);
> @@ -1274,11 +1280,6 @@ error:
> self->priv->socket = NULL;
> }
>
> - if (self->priv->fd >= 0) {
> - close (self->priv->fd);
> - self->priv->fd = -1;
> - }
> -
> return FALSE;
> }
>
> @@ -1348,6 +1349,13 @@ _close_internal (MMPortSerial *self, gboolean force)
> tcflush (self->priv->fd, TCIOFLUSH);
> }
>
> + /* Close fd if not part of the iochannel (shutdown closes fd) */
> + if (self->priv->fd >= 0) {
> + if (!self->priv->iochannel)
> + close (self->priv->fd);
> + self->priv->fd = -1;
> + }
> +
> /* Destroy channel */
> if (self->priv->iochannel) {
> data_watch_enable (self, FALSE);
> @@ -1356,12 +1364,6 @@ _close_internal (MMPortSerial *self, gboolean force)
> self->priv->iochannel = NULL;
> }
>
> - /* Close fd, if any */
> - if (self->priv->fd >= 0) {
> - close (self->priv->fd);
> - self->priv->fd = -1;
> - }
> -
> /* Destroy socket */
> if (self->priv->socket) {
> data_watch_enable (self, FALSE);
I believe the analysis is good indeed. When we create a GIOChannel
from a FD, there is nothing in the documentation saying that
g_io_channel_shutdown() won't close the underlying FD. There is an
explicit mention to g_io_channel_unref() not implicitly closing it,
but nothing about shutdown. Following the GLib code, I see how
g_io_channel_shutdown() will call the "io_close" method in the
GIOFuncs table, which for the unix channel will run g_io_unix_close(),
always doing an explicit close().
Why not change the logic and instead of not closing when there is no
iochannel, we just remove the g_io_channel_shutdown() calls? We know
that g_io_channel_unref() won't do any implicit close, so we can
safely destroy the iochannel regardless of whether we leave the
underlying FD open, and then close the FD with close() ourselves
unconditionally. That would probably be easier to understand as well?
Instead of the shutdown() call we could place a big fat /* comment */
explaining why we don't call shutdown.
What do you think?
--
Aleksander
https://aleksander.es
More information about the ModemManager-devel
mailing list