[PATCH] dmabuf: fix dmabuf file poll uaf issue

Christian König christian.koenig at amd.com
Thu Apr 18 06:46:09 UTC 2024


Am 18.04.24 um 03:33 schrieb zhiguojiang:
> 在 2024/4/15 19:57, Christian König 写道:
>> [Some people who received this message don't often get email from 
>> christian.koenig at amd.com. Learn why this is important at 
>> https://aka.ms/LearnAboutSenderIdentification ]
>>
>> Am 15.04.24 um 12:35 schrieb zhiguojiang:
>>> 在 2024/4/12 14:39, Christian König 写道:
>>>> [Some people who received this message don't often get email from
>>>> christian.koenig at amd.com. Learn why this is important at
>>>> https://aka.ms/LearnAboutSenderIdentification ]
>>>>
>>>> Am 12.04.24 um 08:19 schrieb zhiguojiang:
>>>>> [SNIP]
>>>>> -> Here task 2220 do epoll again where internally it will get/put 
>>>>> then
>>>>> start to free twice and lead to final crash.
>>>>>
>>>>> Here is the basic flow:
>>>>>
>>>>> 1. Thread A install the dma_buf_fd via dma_buf_export, the fd 
>>>>> refcount
>>>>> is 1
>>>>>
>>>>> 2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)
>>>>>
>>>>> 3. After use the dma buf, Thread A close the fd, then here fd 
>>>>> refcount
>>>>> is 0,
>>>>>   and it will run __fput finally to release the file
>>>>
>>>> Stop, that isn't correct.
>>>>
>>>> The fs layer which calls dma_buf_poll() should make sure that the file
>>>> can't go away until the function returns.
>>>>
>>>> Then inside dma_buf_poll() we add another reference to the file while
>>>> installing the callback:
>>>>
>>>>                         /* Paired with fput in dma_buf_poll_cb */
>>>>                         get_file(dmabuf->file);
>>> Hi,
>>>
>>> The problem may just occurred here.
>>>
>>> Is it possible file reference count already decreased to 0 and fput
>>> already being in progressing just before calling
>>> get_file(dmabuf->file) in dma_buf_poll()?
>>
>> No, exactly that isn't possible.
>>
>> If a function gets a dma_buf pointer or even more general any reference
>> counted pointer which has already decreased to 0 then that is a major
>> bug in the caller of that function.
>>
>> BTW: It's completely illegal to work around such issues by using
>> file_count() or RCU functions. So when you suggest stuff like that it
>> will immediately face rejection.
>>
>> Regards,
>> Christian.
> Hi,
>
> Thanks for your kind comment.
>
> 'If a function gets a dma_buf pointer or even more general any reference
>
> counted pointer which has already decreased to 0 then that is a major
>
> bug in the caller of that function.'
>
> According to the issue log, we can see the dma buf file close and 
> epoll operation running in parallel.
>
> Apparently at the moment caller calls epoll, although another task 
> caller already called close on the same fd, but this fd was still in 
> progress to close, so fd is still valid, thus no EBADF returned to 
> caller.

No, exactly that can't happen either.

As far as I can see the EPOLL holds a reference to the files it 
contains. So it is perfectly valid to add the file descriptor to EPOLL 
and then close it.

In this case the file won't be closed, but be kept alive by it's 
reference in the EPOLL file descriptor.

>
> Then the two task contexts operate on same dma buf fd(one is close, 
> another is epoll) in kernel space.
>
> Please kindly help to comment whether above scenario is possible.
>
> If there is some bug in the caller logic, Can u help to point it out? :)

Well what could be is that the EPOLL implementation is broken somehow, 
but I have really doubts on that.

As I said before the most likely explanation is that you have a broken 
device driver which messes up the DMA-buf file reference count somehow.

Regards,
Christian.

>
> Regards,
> Zhiguo
>>
>>>
>>>>
>>>> This reference is only dropped after the callback is completed in
>>>> dma_buf_poll_cb():
>>>>
>>>>         /* Paired with get_file in dma_buf_poll */
>>>>         fput(dmabuf->file);
>>>>
>>>> So your explanation for the issue just seems to be incorrect.
>>>>
>>>>>
>>>>> 4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
>>>>> still resides in one epoll_list.
>>>>>   Although __fput will call eventpoll_release to remove the file from
>>>>> binded epoll list,
>>>>>   but it has small time window where Thread B jumps in.
>>>>
>>>> Well if that is really the case then that would then be a bug in
>>>> epoll_ctl().
>>>>
>>>>>
>>>>> 5. During the small window, Thread B do the poll action for
>>>>> dma_buf_fd, where it will fget/fput for the file,
>>>>>   this means the fd refcount will be 0 -> 1 -> 0, and it will call
>>>>> __fput again.
>>>>>   This will lead to __fput twice for the same file.
>>>>>
>>>>> 6. So the potenial fix is use get_file_rcu which to check if file
>>>>> refcount already zero which means under free.
>>>>>   If so, we just return and no need to do the dma_buf_poll.
>>>>
>>>> Well to say it bluntly as far as I can see this suggestion is 
>>>> completely
>>>> nonsense.
>>>>
>>>> When the reference to the file goes away while dma_buf_poll() is
>>>> executed then that's a massive bug in the caller of that function.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Here is the race condition:
>>>>>
>>>>> Thread A Thread B
>>>>> dma_buf_export
>>>>> fd_refcount is 1
>>>>> epoll_ctl(EPOLL_ADD)
>>>>> add dma_buf_fd to epoll list
>>>>> close(dma_buf_fd)
>>>>> fd_refcount is 0
>>>>> __fput
>>>>> dma_buf_poll
>>>>> fget
>>>>> fput
>>>>> fd_refcount is zero again
>>>>>
>>>>> Thanks
>>>>>
>>>>
>>>
>>
>



More information about the dri-devel mailing list