[Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

Sat Jun 27 20:41:10 PDT 2015

Oh duh. Thanks for the super-detailed explanation. To rephrase what
you said in a slightly shorter manner:

See bug https://bugs.freedesktop.org/show_bug.cgi?id=79823 for why we
need to dupfd (which we are already, although radeon might not be).

Except instead of sticking the dup'd fd into the hash table, whose
lifetime matches that of the device, we stick the other one in, which
is effectively owned by the X server. As a result, those fstat's might
fail down the line, and all sorts of hell will break loose.

Would you be opposed to me rewriting your commit message to basically
reflect the above? Perhaps your original one did as well, but it
wasn't clear to me. I'd also like to throw some assert's in to make
sure the fstat's don't fail -- does that sound reasonable?

Another solution, btw, is to stick <stat.st_dev, stat.st_ino,
stat.st_rdev> as the real key in the hash, although that'd involve
making pointers. Probably not worth it.

Cheers,

  -ilia

On Sat, Jun 27, 2015 at 11:13 PM, Mario Kleiner
<mario.kleiner.de at gmail.com> wrote:
> On 06/28/2015 03:48 AM, Ilia Mirkin wrote:
>>
>> On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
>> <mario.kleiner.de at gmail.com> wrote:
>>>
>>> The dup'ed fd owned by the nouveau_screen for a device node
>>> must also be used as key for the winsys hash table, instead
>>> of using the original fd passed in for a screen, to make
>>> multi-x-screen ZaphodHeads configurations work on nouveau.
>>>
>>> This prevents the following crash scenario that was observed
>>> when a dynamically loaded rendering plugin used OpenGL on a
>>> ZaphodHeads setup, e.g., on a dual x-screen setup. At first
>>> load the plugin worked, but after unloading and reloading it,
>>> the next rendering operation crashed:
>>>
>>> 1. Client, e.g., a plugin, calls glXQueryVersion.
>>>
>>> 2. DRI screens for the x-screens 0 and 1 are created, one shared
>>>     nouveau_screen is created for the shared device node of both
>>>     screens, but the original fd of x-screen 0 is used as identifying
>>>     key in the hash table, instead of the dup()ed fd of x-screen 0
>>>     which is owned by the nouveau_screen. nouveau_screen's refcount
>>>     is now 2.
>>
>>
>> See below, but it shouldn't matter which fd gets used.
>>
>>>
>>> 3. Regular rendering happens by the client plugin, then the plugin
>>>     gets unloaded.
>>>
>>> 4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
>>>     nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
>>>     code then closes the fd of x-screen 0, so now the fd which is
>>>     used as key in the hash table is invalid. x-screen 1 gets
>>>     destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
>>>     the nouveau_screen gets destroyed, but removal of its entry
>>>     in the hash table fails, because the invalid fd in the hash
>>>     table no longer matches anything (fstat() on the fd is used
>>>     for hashing and key comparison, but fstat() on an already closed
>>>     fd fails and returns bogus results). x-screen 1 closes its fd.
>>>
>>>     Now all fd's are closed, the nouveau_screen destroyed, but
>>>     there is a dangling reference to the nouveau_screen in the
>>>     hash table.
>>>
>>> 5. Some OpenGL client plugin gets loaded again and calls
>>>     glXQueryVersion. Step 2 above repeats, but because a
>>>     dangling reference with a matching fd is found in the winsys
>>>     hash table, no new nouveau_screen is created this time. Instead
>>>     the invalid pointer to the old nouveau_screen is recycled,
>>>     which points to nirvana -> Crash.
>>>
>>> This problem is avoided by use of the dup()ed fd which is
>>> owned by the nouveau_screen and has the same lifetime as
>>> the nouveau_screen itself.
>>
>>
>> I need to think about this some more, but... this shouldn't happen :)
>>
>> In fact, the whole dupfd thing was added there for ZaphodHeads screens
>> in the first place. See
>> https://bugs.freedesktop.org/show_bug.cgi?id=79823 and commit
>> a59f2bb17bcc which fixed it.
>>
>> Note that the hash has the following hash/eq functions:
>>
>> static unsigned hash_fd(void *key)
>> {
>>      int fd = pointer_to_intptr(key);
>>      struct stat stat;
>>      fstat(fd, &stat);
>>
>>      return stat.st_dev ^ stat.st_ino ^ stat.st_rdev;
>> }
>>
>> static int compare_fd(void *key1, void *key2)
>> {
>>      int fd1 = pointer_to_intptr(key1);
>>      int fd2 = pointer_to_intptr(key2);
>>      struct stat stat1, stat2;
>>      fstat(fd1, &stat1);
>>      fstat(fd2, &stat2);
>>
>>      return stat1.st_dev != stat2.st_dev ||
>>             stat1.st_ino != stat2.st_ino ||
>>             stat1.st_rdev != stat2.st_rdev;
>> }
>>
>> so fd and dupfd should get hashed to the same thing. I suspect there's
>> something else going on in your application...
>>
>
> My application is a set of dynamically loaded plugins running inside another
> host app (Psychtoolbox-3 inside GNU/Octave), so what happens often is that
> the OpenGL using plugin gets unloaded and reloaded at runtime, so a after a
> OpenGL session has ended with a XCloseDisplay() tearing the winsys down, you
> can easily have it restart at plugin reload with another XOpenDisplay ->
> glXQueryVersion -> .... sequence, where the new call to glXQueryVersion will
> trigger a recreation of the winsys, which will find the stale entry in the
> hash table pointing to nowhere instead of the previously released
> nouveau_screen -> use-after-free -> boom!
>
> The reason this fails is because during destruction of the 2nd, 3rd etc.
> x-screen, the already closed fd associated with the 1st x-screen is fed into
> the compare_fd function, so fstat() errors out on the invalid fd. I added
> printf's etc. on both nouveau and now radeon to verify the fstat gives me
> EBADF errors. So the hash calculation goes wrong when trying to find the
> matching element in the hash table with a fd that has a matching hash -> The
> element which should be removed is ignored/not removed because it contains
> an already closed fd for which no proper hash can be calculated anymore ->
> hash comparison during search goes wrong.
>
> This is because multiple x-screens, e.g., 2 x-screens are destroyed in order
> 0, 1 by FreeScreenConfigs() as part of XCloseDisplay(). FreeScreenConfigs()
> calls dri2DestroyScreen() in dri2.c or dri3_destroy_screen in dri3.c, which
> are essentially identical, e.g.,:
>
> static void
> dri2DestroyScreen(struct glx_screen *base)
> {
>    struct dri2_screen *psc = (struct dri2_screen *) base;
>
>    /* Free the direct rendering per screen data */
>    (*psc->core->destroyScreen) (psc->driScreen);
>
> --> This core->destroyScreen calls eventually into the winsys, which then
> drops the refcount of the nouveau_screen or radeon winsys structure by one,
> and tries to release the struct and remove the hash table entry once the
> refcount drops to zero.
>
>    driDestroyConfigs(psc->driver_configs);
>    close(psc->fd);
>
> -> This close() closes the fd of a screen immediately after closing down
> that screen. At the time screen 1 is closed down, the fd associated with
> screen 0 is already closed.
>
>    free(psc);
> }
>
> So you have this sequence during creation:
>
> glXQueryVersion() -> AllocAndFetchScreenConfigs():
>
> create screen 0 and its fd, e.g., fd==5, dup(fd), e.g., fd==6 for the
> nouveau_screen, but store x-screen 0's fd==5 *itself* in the hash table as
> key.
>
> create screen 1. This now creates an fd 7
>
> => refcount is now 2.
>
>
> And this sequence during destruction at XCloseDisplay();
>
> FreeScreenConfigs():
>
> free screen 0: core->destroyScreen() -> ... -> nouveau_drm_screen_unref() =>
> refcount-- is now 1
> close(fd==5) of screen 0 and thereby the fd==5 stored inside the hash table
> as key. From now on using fcntl(fd==5) for hash calculation on that element
> will malfunction.
>
> free screen 1: -> nouveau_drm_screen_unref -> refcount-- is now 0. -> Try to
> remove hash table entry => fails because fcntl(fd==5) on the already closed
> fd==5 of screen 0 fails with EBADF. => Hash table keeps its stale element
> referencing the struct nouveau_screen.
>
> Free nouveau_screen struct (close dup()ed fd==6 for nouveau_screen, close
> down stuff, free the struct).
> -> close(fd==7) associated with screen 1.
>
> Now all fd's (5,6,7) are closed, the struct is gone, the hash table contains
> a stale element with a no longer existent fd==5 and a pointer to an already
> free()d struct nouveau_screen.
>
> Now we reload the plugin and do glXQueryVersion() again, so
> AllocAndFetchScreenConfigs() is executed again to create the winsys etc. for
> screens 0 and 1.
>
> create screen 0 and its fd: Here the first unused fd in the filedescriptor
> table is recycled, which happens to be fd==5, just as in the first session
> with the plugin! Now suddenly fd=5 is again the valid fd for x-screen 0, and
> it points to the same /dev/drm/card0 device file as before. That means when
> nouveau_drm_screen_create(fd==5) is called, it finds the stale element in
> the hash table from the previous session, because the previously closed fd 5
> in that element is now valid and open again and points to the same device
> file as in the previous session, ergo has the same hash etc.
> nouveau_drm_screen_create() concludes that a fully initialized
> nouveau_screen already exists for this session, except it doesn't exist -
> the pointer in the hash table points to invalid previously freed memory.
> When it tries to access that memory we have a use-after-free and the
> application segfaults.
>
> So it depends a bit on what happens in the applications memory between the
> two consecutive OpenGL sessions etc. how long it takes for an invalid memory
> access to happen, but eventually it hits invalid data. Maybe occassionally
> it gets lucky and the freed memory is still accessible and the struct
> intact, so stuff works by chance.
>
> If the application happened to open other files inbetween the 1st and 2nd
> session, then the relevant fd in the 2nd session may not be exactly the same
> as in the first session, because recycling that fd didn't work, so we are
> only left with a dead but harmless entry in the hash table instead of a
> crasher. I assume something like that prevented my crashes on radeon in the
> past, because when i wasn't expecting/looking for trouble i didn't use a
> test sequence carefully made to trigger the bug for certain.
>
> Similar things happen on radeon for the same reason, ergo the 2nd patch with
> a similar fix.
>
> I don't have the gdb backtrace for nouveau anymore, but the traces for
> radeon are attached to this mail to show you the flow of execution, similar
> to nouveau.
>
> -mario