[PATCH] drm/sched: Document run_job() refcount hazard
Christian König
christian.koenig at amd.com
Fri Dec 20 15:25:44 UTC 2024
Am 20.12.24 um 15:51 schrieb Danilo Krummrich:
> On Fri, Dec 20, 2024 at 03:11:34PM +0100, Philipp Stanner wrote:
>> On Fri, 2024-12-20 at 14:25 +0100, Christian König wrote:
>>> Am 20.12.24 um 14:18 schrieb Danilo Krummrich:
>>>> On Fri, Dec 20, 2024 at 01:53:34PM +0100, Christian König wrote:
>>>>> Am 20.12.24 um 13:45 schrieb Philipp Stanner:
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index 7ce25281c74c..d6f8df39d848 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> + *
>>>>>> + * @sched_job: the job to run
>>>>>> + *
>>>>>> + * Returns: dma_fence the driver must signal once the
>>>>>> hardware has
>>>>>> + * completed the job ("hardware fence").
>>>>>> + *
>>>>>> + * Note that the scheduler expects to 'inherit' its
>>>>>> own reference to
>>>>>> + * this fence from the callback. It does not invoke an
>>>>>> extra
>>>>>> + * dma_fence_get() on it. Consequently, this callback
>>>>>> must return a
>>>>>> + * fence whose refcount is at least 2: One for the
>>>>>> scheduler's
>>>>>> + * reference returned here, another one for the
>>>>>> reference kept by the
>>>>>> + * driver.
>>>>> Well the driver actually doesn't need any extra reference. The
>>>>> scheduler
>>>>> just needs to guarantee that this reference isn't dropped before
>>>>> it is
>>>>> signaled.
>>>> I think he means the reference the driver's fence context has to
>>>> have in order
>>>> to signal that thing eventually.
>>> Yeah, but this is usually a weak reference. IIRC most drivers don't
>>> increment the reference count for the reference they keep to signal a
>>> fence.
>>>
>>> It's expected that the consumers of the dma_fence keep the fence
>>> alive
>>> at least until it is signaled.
>> So are you saying that the driver having an extra reference (without
>> having obtained it with dma_fence_get()) is not an issue because the
>> driver is the one who will signal the fence [and then be done with it]?
> It's never a "real" issue if you have multiple pointers to a reference counted
> object as long as you can ensure that you hold at least one reference for the
> time you have pointers to the object.
Well, I'm not saying that this isn't an issue. I'm just pointing out
that this is the current practice :)
> But, that's bad design. For every pointer to an object a separate reference
> should be taken.
Yeah, completely agree. Weak references are usually a bad idea if you
don't absolutely need them for something.
Regards,
Christian.
More information about the dri-devel
mailing list