[Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

Sat Jun 27 22:13:20 PDT 2015

Pushed. Thanks for the fix and for bearing with me!

On Sun, Jun 28, 2015 at 1:01 AM, Mario Kleiner
<mario.kleiner.de at gmail.com> wrote:
> Yes, that's good.
>
>
> On 06/28/2015 07:00 AM, Ilia Mirkin wrote:
>>
>> How about:
>>
>> /* Use dupfd in hash table, to avoid errors if the original fd gets
>> * closed by its owner. The hash key needs to live at least as long as
>> * the screen.
>> */
>>
>>
>> On Sun, Jun 28, 2015 at 12:57 AM, Mario Kleiner
>> <mario.kleiner.de at gmail.com> wrote:
>>>
>>> On 06/28/2015 06:25 AM, Ilia Mirkin wrote:
>>>>
>>>>
>>>> But that's the thing... in this case, the fd lifetime is owned by the
>>>> X server. In fact, nouveau doesn't touch that fd in mesa beyond
>>>> dup'ing it, and then exclusively using the dup'd fd.
>>>>
>>>
>>> True, it's not owned by nouveau, but instead by mesa core code? I think
>>> the
>>> x-server itself is only involved in authenticating the fd under DRI2,
>>> after
>>> mesa has opened it? Although under DRI3 the ready made fd comes from a
>>> xcb_dri3_open() trip to the server.
>>>
>>> See the open() call on the device file:
>>> dri2CreateScreen() in mesa/src/glx/dri2_glx.c
>>>
>>> xcb_dri3_open():
>>> dri3_create_screen() in mesa/src/glx/dri3_glx.c
>>>
>>> And the close() calls for the fd's are also inside those files.
>>>
>>> Maybe those winsys functions we fixed here are also involved in things
>>> like
>>> EGL for wayland etc.? Haven't checked this.
>>>
>>> Other than that your new commit message is fine. Also i am possibly just
>>> confused about this, and your commit message is an improvement in clarity
>>> anyway :)
>>>
>>> So i think it is fine to leave it as is.
>>>
>>> thanks,
>>> -mario
>>>
>>>
>>>
>>>> On Sun, Jun 28, 2015 at 12:23 AM, Mario Kleiner
>>>> <mario.kleiner.de at gmail.com> wrote:
>>>>>
>>>>>
>>>>> Ok, maybe one thing for the commit message, as i just made the same
>>>>> mixup
>>>>> in
>>>>> my reply: The fd's are not owned by the x-server, but by the mesa
>>>>> screens
>>>>> (pipe screens? dri screens?) they represent, so if such a screen goes
>>>>> away,
>>>>> the fd goes away. Using "X server" would be confusing, especially under
>>>>> dri3, although each such "mesa screen" corresponds to a x-screen.
>>>>>
>>>>> -mario
>>>>>
>>>>>
>>>>> On 06/28/2015 06:03 AM, Mario Kleiner wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 06/28/2015 05:41 AM, Ilia Mirkin wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Oh duh. Thanks for the super-detailed explanation. To rephrase what
>>>>>>> you said in a slightly shorter manner:
>>>>>>>
>>>>>>> See bug https://bugs.freedesktop.org/show_bug.cgi?id=79823 for why we
>>>>>>> need to dupfd (which we are already, although radeon might not be).
>>>>>>>
>>>>>>
>>>>>> Radeon doesn't so far. My 2nd patch for radeon from earlier today adds
>>>>>> that.
>>>>>>
>>>>>>> Except instead of sticking the dup'd fd into the hash table, whose
>>>>>>> lifetime matches that of the device, we stick the other one in, which
>>>>>>> is effectively owned by the X server. As a result, those fstat's
>>>>>>> might
>>>>>>> fail down the line, and all sorts of hell will break loose.
>>>>>>>
>>>>>>
>>>>>> Yes. Essentially we should make sure the fd's we keep around have the
>>>>>> same lifetime as the data structure in which we need to use it. That
>>>>>> was
>>>>>> the point, the server owned fd went away while we still needed it.
>>>>>>
>>>>>>> Would you be opposed to me rewriting your commit message to basically
>>>>>>> reflect the above? Perhaps your original one did as well, but it
>>>>>>> wasn't clear to me. I'd also like to throw some assert's in to make
>>>>>>> sure the fstat's don't fail -- does that sound reasonable?
>>>>>>>
>>>>>>
>>>>>> Oh please, yes! My commit messages often have this disease that they
>>>>>> are
>>>>>> either too terse, when i think the problem is obvious, or too long and
>>>>>> somewhat convoluted when i'm going overboard in the other direction.
>>>>>> Any
>>>>>> editing/shortening for clarity more than happily accepted :)
>>>>>>
>>>>>> thanks,
>>>>>> -mario
>>>>>>
>>>>>>> Another solution, btw, is to stick <stat.st_dev, stat.st_ino,
>>>>>>> stat.st_rdev> as the real key in the hash, although that'd involve
>>>>>>> making pointers. Probably not worth it.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>>      -ilia
>>>>>>>
>>>>>>> On Sat, Jun 27, 2015 at 11:13 PM, Mario Kleiner
>>>>>>> <mario.kleiner.de at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 06/28/2015 03:48 AM, Ilia Mirkin wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
>>>>>>>>> <mario.kleiner.de at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The dup'ed fd owned by the nouveau_screen for a device node
>>>>>>>>>> must also be used as key for the winsys hash table, instead
>>>>>>>>>> of using the original fd passed in for a screen, to make
>>>>>>>>>> multi-x-screen ZaphodHeads configurations work on nouveau.
>>>>>>>>>>
>>>>>>>>>> This prevents the following crash scenario that was observed
>>>>>>>>>> when a dynamically loaded rendering plugin used OpenGL on a
>>>>>>>>>> ZaphodHeads setup, e.g., on a dual x-screen setup. At first
>>>>>>>>>> load the plugin worked, but after unloading and reloading it,
>>>>>>>>>> the next rendering operation crashed:
>>>>>>>>>>
>>>>>>>>>> 1. Client, e.g., a plugin, calls glXQueryVersion.
>>>>>>>>>>
>>>>>>>>>> 2. DRI screens for the x-screens 0 and 1 are created, one shared
>>>>>>>>>>        nouveau_screen is created for the shared device node of
>>>>>>>>>> both
>>>>>>>>>>        screens, but the original fd of x-screen 0 is used as
>>>>>>>>>> identifying
>>>>>>>>>>        key in the hash table, instead of the dup()ed fd of
>>>>>>>>>> x-screen 0
>>>>>>>>>>        which is owned by the nouveau_screen. nouveau_screen's
>>>>>>>>>> refcount
>>>>>>>>>>        is now 2.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> See below, but it shouldn't matter which fd gets used.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 3. Regular rendering happens by the client plugin, then the plugin
>>>>>>>>>>        gets unloaded.
>>>>>>>>>>
>>>>>>>>>> 4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
>>>>>>>>>>        nouveau_drm_screen_unref() drops the refcount to 1, calling
>>>>>>>>>> mesa
>>>>>>>>>>        code then closes the fd of x-screen 0, so now the fd which
>>>>>>>>>> is
>>>>>>>>>>        used as key in the hash table is invalid. x-screen 1 gets
>>>>>>>>>>        destroyed, nouveau_drm_screen_unref() drops the refcount to
>>>>>>>>>> 0,
>>>>>>>>>>        the nouveau_screen gets destroyed, but removal of its entry
>>>>>>>>>>        in the hash table fails, because the invalid fd in the hash
>>>>>>>>>>        table no longer matches anything (fstat() on the fd is used
>>>>>>>>>>        for hashing and key comparison, but fstat() on an already
>>>>>>>>>> closed
>>>>>>>>>>        fd fails and returns bogus results). x-screen 1 closes its
>>>>>>>>>> fd.
>>>>>>>>>>
>>>>>>>>>>        Now all fd's are closed, the nouveau_screen destroyed, but
>>>>>>>>>>        there is a dangling reference to the nouveau_screen in the
>>>>>>>>>>        hash table.
>>>>>>>>>>
>>>>>>>>>> 5. Some OpenGL client plugin gets loaded again and calls
>>>>>>>>>>        glXQueryVersion. Step 2 above repeats, but because a
>>>>>>>>>>        dangling reference with a matching fd is found in the
>>>>>>>>>> winsys
>>>>>>>>>>        hash table, no new nouveau_screen is created this time.
>>>>>>>>>> Instead
>>>>>>>>>>        the invalid pointer to the old nouveau_screen is recycled,
>>>>>>>>>>        which points to nirvana -> Crash.
>>>>>>>>>>
>>>>>>>>>> This problem is avoided by use of the dup()ed fd which is
>>>>>>>>>> owned by the nouveau_screen and has the same lifetime as
>>>>>>>>>> the nouveau_screen itself.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I need to think about this some more, but... this shouldn't happen
>>>>>>>>> :)
>>>>>>>>>
>>>>>>>>> In fact, the whole dupfd thing was added there for ZaphodHeads
>>>>>>>>> screens
>>>>>>>>> in the first place. See
>>>>>>>>> https://bugs.freedesktop.org/show_bug.cgi?id=79823 and commit
>>>>>>>>> a59f2bb17bcc which fixed it.
>>>>>>>>>
>>>>>>>>> Note that the hash has the following hash/eq functions:
>>>>>>>>>
>>>>>>>>> static unsigned hash_fd(void *key)
>>>>>>>>> {
>>>>>>>>>         int fd = pointer_to_intptr(key);
>>>>>>>>>         struct stat stat;
>>>>>>>>>         fstat(fd, &stat);
>>>>>>>>>
>>>>>>>>>         return stat.st_dev ^ stat.st_ino ^ stat.st_rdev;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> static int compare_fd(void *key1, void *key2)
>>>>>>>>> {
>>>>>>>>>         int fd1 = pointer_to_intptr(key1);
>>>>>>>>>         int fd2 = pointer_to_intptr(key2);
>>>>>>>>>         struct stat stat1, stat2;
>>>>>>>>>         fstat(fd1, &stat1);
>>>>>>>>>         fstat(fd2, &stat2);
>>>>>>>>>
>>>>>>>>>         return stat1.st_dev != stat2.st_dev ||
>>>>>>>>>                stat1.st_ino != stat2.st_ino ||
>>>>>>>>>                stat1.st_rdev != stat2.st_rdev;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> so fd and dupfd should get hashed to the same thing. I suspect
>>>>>>>>> there's
>>>>>>>>> something else going on in your application...
>>>>>>>>>
>>>>>>>>
>>>>>>>> My application is a set of dynamically loaded plugins running inside
>>>>>>>> another
>>>>>>>> host app (Psychtoolbox-3 inside GNU/Octave), so what happens often
>>>>>>>> is
>>>>>>>> that
>>>>>>>> the OpenGL using plugin gets unloaded and reloaded at runtime, so a
>>>>>>>> after a
>>>>>>>> OpenGL session has ended with a XCloseDisplay() tearing the winsys
>>>>>>>> down, you
>>>>>>>> can easily have it restart at plugin reload with another
>>>>>>>> XOpenDisplay
>>>>>>>> ->
>>>>>>>> glXQueryVersion -> .... sequence, where the new call to
>>>>>>>> glXQueryVersion will
>>>>>>>> trigger a recreation of the winsys, which will find the stale entry
>>>>>>>> in the
>>>>>>>> hash table pointing to nowhere instead of the previously released
>>>>>>>> nouveau_screen -> use-after-free -> boom!
>>>>>>>>
>>>>>>>> The reason this fails is because during destruction of the 2nd, 3rd
>>>>>>>> etc.
>>>>>>>> x-screen, the already closed fd associated with the 1st x-screen is
>>>>>>>> fed into
>>>>>>>> the compare_fd function, so fstat() errors out on the invalid fd. I
>>>>>>>> added
>>>>>>>> printf's etc. on both nouveau and now radeon to verify the fstat
>>>>>>>> gives me
>>>>>>>> EBADF errors. So the hash calculation goes wrong when trying to find
>>>>>>>> the
>>>>>>>> matching element in the hash table with a fd that has a matching
>>>>>>>> hash
>>>>>>>> -> The
>>>>>>>> element which should be removed is ignored/not removed because it
>>>>>>>> contains
>>>>>>>> an already closed fd for which no proper hash can be calculated
>>>>>>>> anymore ->
>>>>>>>> hash comparison during search goes wrong.
>>>>>>>>
>>>>>>>> This is because multiple x-screens, e.g., 2 x-screens are destroyed
>>>>>>>> in order
>>>>>>>> 0, 1 by FreeScreenConfigs() as part of XCloseDisplay().
>>>>>>>> FreeScreenConfigs()
>>>>>>>> calls dri2DestroyScreen() in dri2.c or dri3_destroy_screen in
>>>>>>>> dri3.c,
>>>>>>>> which
>>>>>>>> are essentially identical, e.g.,:
>>>>>>>>
>>>>>>>> static void
>>>>>>>> dri2DestroyScreen(struct glx_screen *base)
>>>>>>>> {
>>>>>>>>       struct dri2_screen *psc = (struct dri2_screen *) base;
>>>>>>>>
>>>>>>>>       /* Free the direct rendering per screen data */
>>>>>>>>       (*psc->core->destroyScreen) (psc->driScreen);
>>>>>>>>
>>>>>>>> --> This core->destroyScreen calls eventually into the winsys, which
>>>>>>>> then
>>>>>>>> drops the refcount of the nouveau_screen or radeon winsys structure
>>>>>>>> by one,
>>>>>>>> and tries to release the struct and remove the hash table entry once
>>>>>>>> the
>>>>>>>> refcount drops to zero.
>>>>>>>>
>>>>>>>>       driDestroyConfigs(psc->driver_configs);
>>>>>>>>       close(psc->fd);
>>>>>>>>
>>>>>>>> -> This close() closes the fd of a screen immediately after closing
>>>>>>>> down
>>>>>>>> that screen. At the time screen 1 is closed down, the fd associated
>>>>>>>> with
>>>>>>>> screen 0 is already closed.
>>>>>>>>
>>>>>>>>       free(psc);
>>>>>>>> }
>>>>>>>>
>>>>>>>> So you have this sequence during creation:
>>>>>>>>
>>>>>>>> glXQueryVersion() -> AllocAndFetchScreenConfigs():
>>>>>>>>
>>>>>>>> create screen 0 and its fd, e.g., fd==5, dup(fd), e.g., fd==6 for
>>>>>>>> the
>>>>>>>> nouveau_screen, but store x-screen 0's fd==5 *itself* in the hash
>>>>>>>> table as
>>>>>>>> key.
>>>>>>>>
>>>>>>>> create screen 1. This now creates an fd 7
>>>>>>>>
>>>>>>>> => refcount is now 2.
>>>>>>>>
>>>>>>>>
>>>>>>>> And this sequence during destruction at XCloseDisplay();
>>>>>>>>
>>>>>>>> FreeScreenConfigs():
>>>>>>>>
>>>>>>>> free screen 0: core->destroyScreen() -> ... ->
>>>>>>>> nouveau_drm_screen_unref() =>
>>>>>>>> refcount-- is now 1
>>>>>>>> close(fd==5) of screen 0 and thereby the fd==5 stored inside the
>>>>>>>> hash
>>>>>>>> table
>>>>>>>> as key. From now on using fcntl(fd==5) for hash calculation on that
>>>>>>>> element
>>>>>>>> will malfunction.
>>>>>>>>
>>>>>>>> free screen 1: -> nouveau_drm_screen_unref -> refcount-- is now 0.
>>>>>>>> ->
>>>>>>>> Try to
>>>>>>>> remove hash table entry => fails because fcntl(fd==5) on the already
>>>>>>>> closed
>>>>>>>> fd==5 of screen 0 fails with EBADF. => Hash table keeps its stale
>>>>>>>> element
>>>>>>>> referencing the struct nouveau_screen.
>>>>>>>>
>>>>>>>> Free nouveau_screen struct (close dup()ed fd==6 for nouveau_screen,
>>>>>>>> close
>>>>>>>> down stuff, free the struct).
>>>>>>>> -> close(fd==7) associated with screen 1.
>>>>>>>>
>>>>>>>> Now all fd's (5,6,7) are closed, the struct is gone, the hash table
>>>>>>>> contains
>>>>>>>> a stale element with a no longer existent fd==5 and a pointer to an
>>>>>>>> already
>>>>>>>> free()d struct nouveau_screen.
>>>>>>>>
>>>>>>>> Now we reload the plugin and do glXQueryVersion() again, so
>>>>>>>> AllocAndFetchScreenConfigs() is executed again to create the winsys
>>>>>>>> etc. for
>>>>>>>> screens 0 and 1.
>>>>>>>>
>>>>>>>> create screen 0 and its fd: Here the first unused fd in the
>>>>>>>> filedescriptor
>>>>>>>> table is recycled, which happens to be fd==5, just as in the first
>>>>>>>> session
>>>>>>>> with the plugin! Now suddenly fd=5 is again the valid fd for
>>>>>>>> x-screen
>>>>>>>> 0, and
>>>>>>>> it points to the same /dev/drm/card0 device file as before. That
>>>>>>>> means when
>>>>>>>> nouveau_drm_screen_create(fd==5) is called, it finds the stale
>>>>>>>> element in
>>>>>>>> the hash table from the previous session, because the previously
>>>>>>>> closed fd 5
>>>>>>>> in that element is now valid and open again and points to the same
>>>>>>>> device
>>>>>>>> file as in the previous session, ergo has the same hash etc.
>>>>>>>> nouveau_drm_screen_create() concludes that a fully initialized
>>>>>>>> nouveau_screen already exists for this session, except it doesn't
>>>>>>>> exist -
>>>>>>>> the pointer in the hash table points to invalid previously freed
>>>>>>>> memory.
>>>>>>>> When it tries to access that memory we have a use-after-free and the
>>>>>>>> application segfaults.
>>>>>>>>
>>>>>>>> So it depends a bit on what happens in the applications memory
>>>>>>>> between the
>>>>>>>> two consecutive OpenGL sessions etc. how long it takes for an
>>>>>>>> invalid
>>>>>>>> memory
>>>>>>>> access to happen, but eventually it hits invalid data. Maybe
>>>>>>>> occassionally
>>>>>>>> it gets lucky and the freed memory is still accessible and the
>>>>>>>> struct
>>>>>>>> intact, so stuff works by chance.
>>>>>>>>
>>>>>>>> If the application happened to open other files inbetween the 1st
>>>>>>>> and
>>>>>>>> 2nd
>>>>>>>> session, then the relevant fd in the 2nd session may not be exactly
>>>>>>>> the same
>>>>>>>> as in the first session, because recycling that fd didn't work, so
>>>>>>>> we
>>>>>>>> are
>>>>>>>> only left with a dead but harmless entry in the hash table instead
>>>>>>>> of
>>>>>>>> a
>>>>>>>> crasher. I assume something like that prevented my crashes on radeon
>>>>>>>> in the
>>>>>>>> past, because when i wasn't expecting/looking for trouble i didn't
>>>>>>>> use
>>>>>>>> a
>>>>>>>> test sequence carefully made to trigger the bug for certain.
>>>>>>>>
>>>>>>>> Similar things happen on radeon for the same reason, ergo the 2nd
>>>>>>>> patch with
>>>>>>>> a similar fix.
>>>>>>>>
>>>>>>>> I don't have the gdb backtrace for nouveau anymore, but the traces
>>>>>>>> for
>>>>>>>> radeon are attached to this mail to show you the flow of execution,
>>>>>>>> similar
>>>>>>>> to nouveau.
>>>>>>>>
>>>>>>>> -mario