[PATCH] port-serial: don't close tty file descriptor twice

Brian, Sam Samuel.Brian at digi.com
Thu Jun 21 01:11:05 UTC 2018


Hi Aleksander,


On 20/06/18 22:56, Aleksander Morgado wrote:
> Hey,
> 
> On Wed, Jun 20, 2018 at 12:25 AM, Brian, Sam <Samuel.Brian at digi.com> wrote:
>> After the tty is opened, the fd is passed off to a g_io_channel which
>> manages it. When the serial port is closed with _close_internal(),
>> g_io_channel_shutdown() is called which then calls g_io_unix_close() which
>> in turn close()s the fd.
>> Currently, _close_internal() performs a second explicit close() after the
>> g_io_channel shutdown which *in the best case scenario* will fail with EBADF.
>> This is a race condition that can result in other file descriptors being
>> closed.
>>
>> This change avoids double closing the serial port fd.
>> ---
>>
>> Hey,
>> I had this problem with a Telit LE910-EU1. What do you think of the fix?
>>
>> To give some more detail about the problem...
>>
>> The g_io_channel_shutdown() calling close() can be verified with
>> this short program:
>>
>>    $ cat test.c
>>    #include <glib.h>
>>    #include <errno.h>
>>    #include <stdio.h>
>>    #include <unistd.h>
>>
>>    int main (int argc, char* argv[])
>>    {
>>      int ret;
>>      GIOChannel *channel;
>>
>>      channel = g_io_channel_unix_new(STDIN_FILENO);
>>      g_io_channel_shutdown(channel, FALSE, NULL);
>>
>>      ret = close(STDIN_FILENO);
>>      printf("close() -> %d: %m\n", ret);
>>      return 0;
>>    }
>>
>>    $ gcc -o test test.c -g $(pkg-config --cflags --libs gio-2.0)
>>    $ ./test
>>    close() -> -1: Bad file descriptor
>>
>> Here is a snippet of strace from ModemManager that demonstrates when the
>> race causes another fd to be closed:
>>
>>    ### Close the tty (from g_io_channel_shutdown)
>>    [pid   413] close(9 <unfinished ...>
>>    [pid   687] <... poll resumed> )        = 1 ([{fd=7, revents=POLLIN}])
>>    ...
>>    ### tty's closed fd is reused by eventfd2
>>    [pid   687] SYS_356(0, 0x80800, 0xc6c4c8, 0, 0xc6c4d0) = 9
>>    ...
>>    [pid   687] poll([{fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN}], 3, -1 <unfinished ...>
>>    ### First close succeeds (tty fd)
>>    [pid   413] <... close resumed> )       = 0
>>    ### Second (explicit) close succeeds (eventfd fd, oops)
>>    [pid   413] close(9)                    = 0
>>
>> PID 413 is ModemManager, PID 687 is a clone process created with the flag
>> CLONE_FILES so the fd table is shared between parent and clone.
>>  From the clone's read/writes, it appears to be used to communicate with
>> dbus.
>>
>> Strangely, closing the eventfd didn't always cause a noticable failure.
>> It was only when *somehow* the port-serial code tried to use an eventfd as
>> a tty and AT commands fail.
>>
>>   src/mm-port-serial.c | 24 +++++++++++++-----------
>>   1 file changed, 13 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/mm-port-serial.c b/src/mm-port-serial.c
>> index 44699a76..9ba4e5d2 100644
>> --- a/src/mm-port-serial.c
>> +++ b/src/mm-port-serial.c
>> @@ -1262,6 +1262,12 @@ success:
>>   error:
>>       mm_warn ("(%s) failed to open serial device", device);
>>
>> +    if (self->priv->fd >= 0) {
>> +        if (!self->priv->iochannel)
>> +            close (self->priv->fd);
>> +        self->priv->fd = -1;
>> +    }
>> +
>>       if (self->priv->iochannel) {
>>           g_io_channel_shutdown (self->priv->iochannel, FALSE, NULL);
>>           g_io_channel_unref (self->priv->iochannel);
>> @@ -1274,11 +1280,6 @@ error:
>>           self->priv->socket = NULL;
>>       }
>>
>> -    if (self->priv->fd >= 0) {
>> -        close (self->priv->fd);
>> -        self->priv->fd = -1;
>> -    }
>> -
>>       return FALSE;
>>   }
>>
>> @@ -1348,6 +1349,13 @@ _close_internal (MMPortSerial *self, gboolean force)
>>               tcflush (self->priv->fd, TCIOFLUSH);
>>           }
>>
>> +        /* Close fd if not part of the iochannel (shutdown closes fd) */
>> +        if (self->priv->fd >= 0) {
>> +            if (!self->priv->iochannel)
>> +                close (self->priv->fd);
>> +            self->priv->fd = -1;
>> +        }
>> +
>>           /* Destroy channel */
>>           if (self->priv->iochannel) {
>>               data_watch_enable (self, FALSE);
>> @@ -1356,12 +1364,6 @@ _close_internal (MMPortSerial *self, gboolean force)
>>               self->priv->iochannel = NULL;
>>           }
>>
>> -        /* Close fd, if any */
>> -        if (self->priv->fd >= 0) {
>> -            close (self->priv->fd);
>> -            self->priv->fd = -1;
>> -        }
>> -
>>           /* Destroy socket */
>>           if (self->priv->socket) {
>>               data_watch_enable (self, FALSE);
> 
> I believe the analysis is good indeed. When we create a GIOChannel
> from a FD, there is nothing in the documentation saying that
> g_io_channel_shutdown() won't close the underlying FD. There is an
> explicit mention to g_io_channel_unref() not implicitly closing it,
> but nothing about shutdown. Following the GLib code, I see how
> g_io_channel_shutdown() will call the "io_close" method in the
> GIOFuncs table, which for the unix channel will run g_io_unix_close(),
> always doing an explicit close().

I didn't find the g_io_channel_shutdown() documentation immediately 
obvious either. The strace tipped me off, and then I followed the code.

> 
> Why not change the logic and instead of not closing when there is no
> iochannel, we just remove the g_io_channel_shutdown() calls? We know
> that g_io_channel_unref() won't do any implicit close, so we can
> safely destroy the iochannel regardless of whether we leave the
> underlying FD open, and then close the FD with close() ourselves
> unconditionally. That would probably be easier to understand as well?
> Instead of the shutdown() call we could place a big fat /* comment */
> explaining why we don't call shutdown. >
> What do you think?
> 

If we directly close() instead of using g_io_channel_shutdown() then
   (1) the channel buffer won't be flushed before closing (which is 
probably fine?), and
   (2) if there are other refs to the channel and the "closing" 
g_io_channel_unref() doesn't destroy it, then the channel's internal 
record keeping will think that the closed fd is still open and 
readable/writable.
I know (2) is not a problem today because there are no other 
g_io_channel_ref() calls, but it could be overlooked if one was added in 
future.

Perhaps a better way would be to only use g_io_channel_shutdown() and 
replace priv->fd with g_io_channel_unix_get_fd() where required for the 
tcgetattr/tcsetattr/ioctl?
Removing priv->fd would mean the fd is tracked in only one place. The 
"self->priv->fd >= 0" checks would become the "self->priv->iochannel" 
check so the fd being open would be tied directly to the iochannel being 
present.

Sam


More information about the ModemManager-devel mailing list