[PATCH 2/3] drm/scheduler: Fix UAF in drm_sched_fence_get_timeline_name
Asahi Lina
lina at asahilina.net
Fri Jul 14 09:49:44 UTC 2023
On 14/07/2023 17.43, Christian König wrote:
> Am 14.07.23 um 10:21 schrieb Asahi Lina:
>> A signaled scheduler fence can outlive its scheduler, since fences are
>> independently reference counted. Therefore, we can't reference the
>> scheduler in the get_timeline_name() implementation.
>>
>> Fixes oopses on `cat /sys/kernel/debug/dma_buf/bufinfo` when shared
>> dma-bufs reference fences from GPU schedulers that no longer exist.
>>
>> Signed-off-by: Asahi Lina <lina at asahilina.net>
>> ---
>> drivers/gpu/drm/scheduler/sched_entity.c | 7 ++++++-
>> drivers/gpu/drm/scheduler/sched_fence.c | 4 +++-
>> include/drm/gpu_scheduler.h | 5 +++++
>> 3 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>> index b2bbc8a68b30..17f35b0b005a 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -389,7 +389,12 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
>>
>> /*
>> * Fence is from the same scheduler, only need to wait for
>> - * it to be scheduled
>> + * it to be scheduled.
>> + *
>> + * Note: s_fence->sched could have been freed and reallocated
>> + * as another scheduler. This false positive case is okay, as if
>> + * the old scheduler was freed all of its jobs must have
>> + * signaled their completion fences.
>
> This is outright nonsense. As long as an entity for a scheduler exists
> it is not allowed to free up this scheduler.
>
> So this function can't be called like this.

As I already explained, the fences can outlive their scheduler. That
means *this* entity certainly exists for *this* scheduler, but the
*dependency* fence might have come from a past scheduler that was
already destroyed along with all of its entities, and its address reused.
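
To make the lifetime issue concrete, here is a minimal userspace sketch in C.
It is not the code from this patch: `struct sched`, `struct fence`, and every
function name below are hypothetical stand-ins for illustration only. It
assumes one possible way to keep the name getter safe, which is to have the
fence carry its own copy of the timeline name so the getter never follows the
scheduler pointer. Calling `demo_fence_outlives_sched()` from a `main()`
should return 0 under these assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a scheduler; not the real drm_gpu_scheduler. */
struct sched {
    char name[32];
};

/* Hypothetical stand-in for a reference-counted scheduler fence. */
struct fence {
    int refcount;
    struct sched *sched;     /* may dangle once the scheduler is freed */
    char timeline_name[32];  /* private copy taken while sched is alive */
};

static struct fence *fence_create(struct sched *s)
{
    struct fence *f = calloc(1, sizeof(*f));

    if (!f)
        return NULL;
    f->refcount = 1;
    f->sched = s;
    /* Copy the name now; the scheduler may not exist later. */
    snprintf(f->timeline_name, sizeof(f->timeline_name), "%s", s->name);
    return f;
}

/* Safe after the scheduler is gone: never dereferences f->sched. */
static const char *fence_get_timeline_name(const struct fence *f)
{
    return f->timeline_name;
}

static void fence_put(struct fence *f)
{
    if (--f->refcount == 0)
        free(f);
}

/* Returns 0 if the name is still readable after the scheduler is freed. */
static int demo_fence_outlives_sched(void)
{
    struct sched *s = malloc(sizeof(*s));
    struct fence *f;
    int ok;

    if (!s)
        return -1;
    snprintf(s->name, sizeof(s->name), "gpu0");

    f = fence_create(s);
    free(s);                 /* scheduler destroyed, fence still referenced */
    if (!f)
        return -1;

    ok = strcmp(fence_get_timeline_name(f), "gpu0") == 0;
    fence_put(f);
    return ok ? 0 : 1;
}
```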

Christian, I'm really getting tired of your tone. I don't appreciate
being told my comments are "outright nonsense" when you don't even take
the time to understand what the issue is and what I'm trying to
do/document. If you aren't interested in working with me, I'm just going
to give up on drm_sched, wait until Rust gets workqueue support, and
reimplement it in Rust. You can keep your broken fence lifetime
semantics and I'll do my own thing.

~~ Lina