[PATCH libX11] _XDefaultError: set XlibDisplayIOError flag before calling exit

arthur.huillet at free.fr arthur.huillet at free.fr
Wed Feb 1 14:02:41 UTC 2017


From: Arthur Huillet <ahuillet at nvidia.com>

_XReply isn't reentrant, and it can lead to deadlocks when the default error handler is called: _XDefaultError calls exit(1). It is called indirectly by _XReply when a X protocol error comes in that isn't filtered/handled by an extension or the application. This means that if the application (or one of its loaded shared libraries such as the NVIDIA OpenGL driver) has registered any _fini destructor, _fini will get called while still on the call stack of _XReply. If the destructor interacts with the X server and calls _XReply, it will hit a deadlock, looping on the following in _XReply:
    ConditionWait(dpy, dpy->xcb->reply_notify);

It is legal for an application to make Xlib calls during _fini, and that is useful for an OpenGL driver to avoid resource leaks on the X server side, for example in the dlopen/dlclose case. However, the driver can not readily tell whether its _fini is being called because Xlib called exit, or for another reason (dlclose), so it is hard to cleanly work around this issue in the driver.

This change makes it so _XReply effectively becomes a no-op when called after _XDefaultError was called, as though an XIOError had happened. The dpy connection isn't broken at that point, but any call to _XReply is going to hang. This is a bit of a kludge, because the more correct solution would be to make _XReply reentrant, maybe by broadcasting the reply_notify condition before calling the default error handler. However, such a change would carry a grater risk of introducing regressions in Xlib.

This change will drop some valid requests on the floor, but this should not matter, as it will only do so in the case where the application is dying: X will clean up after it once exit() is done running. There is the case of XSetCloseDownMode(RETAIN_PERMANENT), but an application using that and wishing to clean up resources in _fini would currently be hitting a deadlock, which is hardly a better situation.
---

I'm not in love with this change, to be honest, and if anyone has a better suggestion I am eager to hear it. It is the best change I could think of, in terms of making the problem go away without risking regressions.

Thanks
-- 
Arthur Huillet

 src/XlibInt.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/XlibInt.c b/src/XlibInt.c
index 9e67dff..1cb386d 100644
--- a/src/XlibInt.c
+++ b/src/XlibInt.c
@@ -1382,6 +1382,16 @@ int _XDefaultError(
 	XErrorEvent *event)
 {
     if (_XPrintDefaultError (dpy, event, stderr) == 0) return 0;
+
+    /*
+     * Store in dpy flags that the client is exiting on an unhandled XError
+     * (pretend it is an IOError, since the application is dying anyway it
+     * does not make a difference).
+     * This is useful for _XReply not to hang if the application makes Xlib
+     * calls in _fini as part of process termination.
+     */
+    dpy->flags |= XlibDisplayIOError;
+
     exit(1);
     /*NOTREACHED*/
 }
-- 
2.10.2



More information about the xorg-devel mailing list