[Mesa-dev] [PATCH] nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

Mario Kleiner mario.kleiner.de at gmail.com
Sat Jun 27 21:23:51 PDT 2015

Ok, maybe one thing for the commit message, as i just made the same 
mixup in my reply: The fd's are not owned by the x-server, but by the 
mesa screens (pipe screens? dri screens?) they represent, so if such a 
screen goes away, the fd goes away. Using "X server" would be confusing, 
especially under dri3, although each such "mesa screen" corresponds to a 


On 06/28/2015 06:03 AM, Mario Kleiner wrote:
> On 06/28/2015 05:41 AM, Ilia Mirkin wrote:
>> Oh duh. Thanks for the super-detailed explanation. To rephrase what
>> you said in a slightly shorter manner:
>> See bug https://bugs.freedesktop.org/show_bug.cgi?id=79823 for why we
>> need to dupfd (which we are already, although radeon might not be).
> Radeon doesn't so far. My 2nd patch for radeon from earlier today adds
> that.
>> Except instead of sticking the dup'd fd into the hash table, whose
>> lifetime matches that of the device, we stick the other one in, which
>> is effectively owned by the X server. As a result, those fstat's might
>> fail down the line, and all sorts of hell will break loose.
> Yes. Essentially we should make sure the fd's we keep around have the
> same lifetime as the data structure in which we need to use it. That was
> the point, the server owned fd went away while we still needed it.
>> Would you be opposed to me rewriting your commit message to basically
>> reflect the above? Perhaps your original one did as well, but it
>> wasn't clear to me. I'd also like to throw some assert's in to make
>> sure the fstat's don't fail -- does that sound reasonable?
> Oh please, yes! My commit messages often have this disease that they are
> either too terse, when i think the problem is obvious, or too long and
> somewhat convoluted when i'm going overboard in the other direction. Any
> editing/shortening for clarity more than happily accepted :)
> thanks,
> -mario
>> Another solution, btw, is to stick <stat.st_dev, stat.st_ino,
>> stat.st_rdev> as the real key in the hash, although that'd involve
>> making pointers. Probably not worth it.
>> Cheers,
>>    -ilia
>> On Sat, Jun 27, 2015 at 11:13 PM, Mario Kleiner
>> <mario.kleiner.de at gmail.com> wrote:
>>> On 06/28/2015 03:48 AM, Ilia Mirkin wrote:
>>>> On Fri, Jun 5, 2015 at 9:36 AM, Mario Kleiner
>>>> <mario.kleiner.de at gmail.com> wrote:
>>>>> The dup'ed fd owned by the nouveau_screen for a device node
>>>>> must also be used as key for the winsys hash table, instead
>>>>> of using the original fd passed in for a screen, to make
>>>>> multi-x-screen ZaphodHeads configurations work on nouveau.
>>>>> This prevents the following crash scenario that was observed
>>>>> when a dynamically loaded rendering plugin used OpenGL on a
>>>>> ZaphodHeads setup, e.g., on a dual x-screen setup. At first
>>>>> load the plugin worked, but after unloading and reloading it,
>>>>> the next rendering operation crashed:
>>>>> 1. Client, e.g., a plugin, calls glXQueryVersion.
>>>>> 2. DRI screens for the x-screens 0 and 1 are created, one shared
>>>>>      nouveau_screen is created for the shared device node of both
>>>>>      screens, but the original fd of x-screen 0 is used as identifying
>>>>>      key in the hash table, instead of the dup()ed fd of x-screen 0
>>>>>      which is owned by the nouveau_screen. nouveau_screen's refcount
>>>>>      is now 2.
>>>> See below, but it shouldn't matter which fd gets used.
>>>>> 3. Regular rendering happens by the client plugin, then the plugin
>>>>>      gets unloaded.
>>>>> 4. XCloseDisplay(). x-screen 0 gets its DRI screen destroyed,
>>>>>      nouveau_drm_screen_unref() drops the refcount to 1, calling mesa
>>>>>      code then closes the fd of x-screen 0, so now the fd which is
>>>>>      used as key in the hash table is invalid. x-screen 1 gets
>>>>>      destroyed, nouveau_drm_screen_unref() drops the refcount to 0,
>>>>>      the nouveau_screen gets destroyed, but removal of its entry
>>>>>      in the hash table fails, because the invalid fd in the hash
>>>>>      table no longer matches anything (fstat() on the fd is used
>>>>>      for hashing and key comparison, but fstat() on an already closed
>>>>>      fd fails and returns bogus results). x-screen 1 closes its fd.
>>>>>      Now all fd's are closed, the nouveau_screen destroyed, but
>>>>>      there is a dangling reference to the nouveau_screen in the
>>>>>      hash table.
>>>>> 5. Some OpenGL client plugin gets loaded again and calls
>>>>>      glXQueryVersion. Step 2 above repeats, but because a
>>>>>      dangling reference with a matching fd is found in the winsys
>>>>>      hash table, no new nouveau_screen is created this time. Instead
>>>>>      the invalid pointer to the old nouveau_screen is recycled,
>>>>>      which points to nirvana -> Crash.
>>>>> This problem is avoided by use of the dup()ed fd which is
>>>>> owned by the nouveau_screen and has the same lifetime as
>>>>> the nouveau_screen itself.
>>>> I need to think about this some more, but... this shouldn't happen :)
>>>> In fact, the whole dupfd thing was added there for ZaphodHeads screens
>>>> in the first place. See
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=79823 and commit
>>>> a59f2bb17bcc which fixed it.
>>>> Note that the hash has the following hash/eq functions:
>>>> static unsigned hash_fd(void *key)
>>>> {
>>>>       int fd = pointer_to_intptr(key);
>>>>       struct stat stat;
>>>>       fstat(fd, &stat);
>>>>       return stat.st_dev ^ stat.st_ino ^ stat.st_rdev;
>>>> }
>>>> static int compare_fd(void *key1, void *key2)
>>>> {
>>>>       int fd1 = pointer_to_intptr(key1);
>>>>       int fd2 = pointer_to_intptr(key2);
>>>>       struct stat stat1, stat2;
>>>>       fstat(fd1, &stat1);
>>>>       fstat(fd2, &stat2);
>>>>       return stat1.st_dev != stat2.st_dev ||
>>>>              stat1.st_ino != stat2.st_ino ||
>>>>              stat1.st_rdev != stat2.st_rdev;
>>>> }
>>>> so fd and dupfd should get hashed to the same thing. I suspect there's
>>>> something else going on in your application...
>>> My application is a set of dynamically loaded plugins running inside
>>> another
>>> host app (Psychtoolbox-3 inside GNU/Octave), so what happens often is
>>> that
>>> the OpenGL using plugin gets unloaded and reloaded at runtime, so a
>>> after a
>>> OpenGL session has ended with a XCloseDisplay() tearing the winsys
>>> down, you
>>> can easily have it restart at plugin reload with another XOpenDisplay ->
>>> glXQueryVersion -> .... sequence, where the new call to
>>> glXQueryVersion will
>>> trigger a recreation of the winsys, which will find the stale entry
>>> in the
>>> hash table pointing to nowhere instead of the previously released
>>> nouveau_screen -> use-after-free -> boom!
>>> The reason this fails is because during destruction of the 2nd, 3rd etc.
>>> x-screen, the already closed fd associated with the 1st x-screen is
>>> fed into
>>> the compare_fd function, so fstat() errors out on the invalid fd. I
>>> added
>>> printf's etc. on both nouveau and now radeon to verify the fstat
>>> gives me
>>> EBADF errors. So the hash calculation goes wrong when trying to find the
>>> matching element in the hash table with a fd that has a matching hash
>>> -> The
>>> element which should be removed is ignored/not removed because it
>>> contains
>>> an already closed fd for which no proper hash can be calculated
>>> anymore ->
>>> hash comparison during search goes wrong.
>>> This is because multiple x-screens, e.g., 2 x-screens are destroyed
>>> in order
>>> 0, 1 by FreeScreenConfigs() as part of XCloseDisplay().
>>> FreeScreenConfigs()
>>> calls dri2DestroyScreen() in dri2.c or dri3_destroy_screen in dri3.c,
>>> which
>>> are essentially identical, e.g.,:
>>> static void
>>> dri2DestroyScreen(struct glx_screen *base)
>>> {
>>>     struct dri2_screen *psc = (struct dri2_screen *) base;
>>>     /* Free the direct rendering per screen data */
>>>     (*psc->core->destroyScreen) (psc->driScreen);
>>> --> This core->destroyScreen calls eventually into the winsys, which
>>> then
>>> drops the refcount of the nouveau_screen or radeon winsys structure
>>> by one,
>>> and tries to release the struct and remove the hash table entry once the
>>> refcount drops to zero.
>>>     driDestroyConfigs(psc->driver_configs);
>>>     close(psc->fd);
>>> -> This close() closes the fd of a screen immediately after closing down
>>> that screen. At the time screen 1 is closed down, the fd associated with
>>> screen 0 is already closed.
>>>     free(psc);
>>> }
>>> So you have this sequence during creation:
>>> glXQueryVersion() -> AllocAndFetchScreenConfigs():
>>> create screen 0 and its fd, e.g., fd==5, dup(fd), e.g., fd==6 for the
>>> nouveau_screen, but store x-screen 0's fd==5 *itself* in the hash
>>> table as
>>> key.
>>> create screen 1. This now creates an fd 7
>>> => refcount is now 2.
>>> And this sequence during destruction at XCloseDisplay();
>>> FreeScreenConfigs():
>>> free screen 0: core->destroyScreen() -> ... ->
>>> nouveau_drm_screen_unref() =>
>>> refcount-- is now 1
>>> close(fd==5) of screen 0 and thereby the fd==5 stored inside the hash
>>> table
>>> as key. From now on using fcntl(fd==5) for hash calculation on that
>>> element
>>> will malfunction.
>>> free screen 1: -> nouveau_drm_screen_unref -> refcount-- is now 0. ->
>>> Try to
>>> remove hash table entry => fails because fcntl(fd==5) on the already
>>> closed
>>> fd==5 of screen 0 fails with EBADF. => Hash table keeps its stale
>>> element
>>> referencing the struct nouveau_screen.
>>> Free nouveau_screen struct (close dup()ed fd==6 for nouveau_screen,
>>> close
>>> down stuff, free the struct).
>>> -> close(fd==7) associated with screen 1.
>>> Now all fd's (5,6,7) are closed, the struct is gone, the hash table
>>> contains
>>> a stale element with a no longer existent fd==5 and a pointer to an
>>> already
>>> free()d struct nouveau_screen.
>>> Now we reload the plugin and do glXQueryVersion() again, so
>>> AllocAndFetchScreenConfigs() is executed again to create the winsys
>>> etc. for
>>> screens 0 and 1.
>>> create screen 0 and its fd: Here the first unused fd in the
>>> filedescriptor
>>> table is recycled, which happens to be fd==5, just as in the first
>>> session
>>> with the plugin! Now suddenly fd=5 is again the valid fd for x-screen
>>> 0, and
>>> it points to the same /dev/drm/card0 device file as before. That
>>> means when
>>> nouveau_drm_screen_create(fd==5) is called, it finds the stale
>>> element in
>>> the hash table from the previous session, because the previously
>>> closed fd 5
>>> in that element is now valid and open again and points to the same
>>> device
>>> file as in the previous session, ergo has the same hash etc.
>>> nouveau_drm_screen_create() concludes that a fully initialized
>>> nouveau_screen already exists for this session, except it doesn't
>>> exist -
>>> the pointer in the hash table points to invalid previously freed memory.
>>> When it tries to access that memory we have a use-after-free and the
>>> application segfaults.
>>> So it depends a bit on what happens in the applications memory
>>> between the
>>> two consecutive OpenGL sessions etc. how long it takes for an invalid
>>> memory
>>> access to happen, but eventually it hits invalid data. Maybe
>>> occassionally
>>> it gets lucky and the freed memory is still accessible and the struct
>>> intact, so stuff works by chance.
>>> If the application happened to open other files inbetween the 1st and
>>> 2nd
>>> session, then the relevant fd in the 2nd session may not be exactly
>>> the same
>>> as in the first session, because recycling that fd didn't work, so we
>>> are
>>> only left with a dead but harmless entry in the hash table instead of a
>>> crasher. I assume something like that prevented my crashes on radeon
>>> in the
>>> past, because when i wasn't expecting/looking for trouble i didn't use a
>>> test sequence carefully made to trigger the bug for certain.
>>> Similar things happen on radeon for the same reason, ergo the 2nd
>>> patch with
>>> a similar fix.
>>> I don't have the gdb backtrace for nouveau anymore, but the traces for
>>> radeon are attached to this mail to show you the flow of execution,
>>> similar
>>> to nouveau.
>>> -mario

More information about the mesa-dev mailing list