[PATCH] epoll: try to be a _bit_ better about file lifetimes
Christian Brauner
brauner at kernel.org
Sun May 5 10:50:21 UTC 2024
On Sat, May 04, 2024 at 08:40:25AM -0700, Linus Torvalds wrote:
> On Sat, 4 May 2024 at 08:32, Linus Torvalds
> <torvalds at linux-foundation.org> wrote:
> >
> > Now, during this TOTALLY INNOCENT sock_poll(), in another thread, the
> > file closing completes, eventpoll_release() finishes [..]
>
> Actually, Al is right that ep_item_poll() should be holding the
> ep->mtx, so eventpoll_release() -> eventpoll_release_file_file() ->
> mutex_lock(&ep->mtx) should block and the file doesn't actually get
> released.
So I know you've seen it yourself but for my own peace of mind I've said
that in the other mail and in the other thread already that all callers
of ep_item_poll() do already hold the ep->mtx:
do_epoll_ctl()
-> epoll_mutex_lock(&ep->mtx)
-> ep_insert()
-> ep_item_poll()
do_epoll_ctl()
-> epoll_mutex_lock(&ep->mtx)
-> ep_modify()
-> ep_item_poll()
ep_send_events()
-> mutex_lock(&ep->mtx)
-> ep_item_poll()
/* nested; and all callers of ep_item_poll() already hold ep->mtx */
__ep_eventpoll_poll()
-> mutex_lock_nested(&ep->mtx, wait)
-> ep_item_poll()
So it's simply not possible to end up with a UAF in f_op->poll() because
eventpoll_release_file_file() serializes on ep->mtx as well:
__fput()
-> eventpoll_release()
-> eventpoll_release_file()
{
// @file->f_count is zero _but file is not freed_
// so taking file->f_lock is absolutely fine
spin_lock(&file->f_lock);
// mark as dying
// serialzed on ep->mtx
mutex_lock(&ep->mtx);
__ep_rmove(ep, epi);
...
}
-> mutex_lock(&ep->mtx)
-> f_op->release()
-> kfree(file)
So afaict it's simply not possible to end up with a UAF in f_op->poll().
And I agree with you that for some instances it's valid to take another
reference to a file from f_op->poll() but then they need to use
get_file_active() imho and simply handle the case where f_count is zero.
And we need to document that in Documentation/filesystems/file.rst or
locking.rst.
But if it's simply just dma buf that cares about that long-term
reference then really we should just force them to take the reference
like I suggested but don't penalize everyone else. When I took a glance
at all f_op->poll() implementations I didn't spot one that did take
extra references.
But if you absolutely want to have epoll take the reference before it
calls into f_op->poll() that's fine with me as well. But we might end up
forcing epoll to do a lot of final fput()s which I'm not sure is all
that desirable.
More information about the dri-devel
mailing list