[PATCH 1/3] drm/sched: add drm_sched_prealloc_dependency_slots v3

Tvrtko Ursulin tursulin at ursulin.net
Tue May 20 16:15:38 UTC 2025


On 19/05/2025 10:04, Philipp Stanner wrote:
> On Mon, 2025-05-19 at 09:51 +0100, Tvrtko Ursulin wrote:
>>
>> On 16/05/2025 18:16, Philipp Stanner wrote:
>>> On Fri, 2025-05-16 at 15:30 +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 16/05/2025 14:38, Philipp Stanner wrote:
>>>>> On Fri, 2025-05-16 at 13:10 +0100, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 16/05/2025 12:53, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>>
> 
> [snip]
> 
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_sched_job_add_prealloc_dep - add dependency
>>>>>>>>>> to
>>>>>>>>>> preallocated
>>>>>>>>>> slot
>>>>>>>>>> + * @job: scheduler job where dependencies will be
>>>>>>>>>> added
>>>>>>>>>> + * @id: the preallocated slot index
>>>>>>>>>> + * @fence: the dependency to add
>>>>>>>>>> + *
>>>>>>>>>> + * Consumes @fence and adds it to the preallocated
>>>>>>>>>> slot
>>>>>>>>>> dependency.
>>>>>>>>>> + */
>>>>>>>>>> +void drm_sched_job_add_prealloc_dep(struct
>>>>>>>>>> drm_sched_job
>>>>>>>>>> *job, u32
>>>>>>>>>> id,
>>>>>>>>>> +                    struct dma_fence *fence)
>>>>>>>>>> +{
>>>>>>>>>> +    fence = xa_store(&job->dependencies, id, fence,
>>>>>>>>>> GFP_ATOMIC);
>>>>>>>>>
>>>>>>>>> Add assert that the passed id exists (was preallocated)
>>>>>>>>> and
>>>>>>>>> is
>>>>>>>>> NULL?
>>>>>>>>
>>>>>>>> You
>>>>>>>
>>>>>>> Hm?
>>>>>>>
>>>>>>>>>
>>>>>>>>> Also, if someone preallocates and does not consume the
>>>>>>>>> slot
>>>>>>>>> will that
>>>>>>>>> confuse the iteration in drm_sched_job_dependency()?
>>>>>>>>
>>>>>>>> drm_sched_job_add_dependency() you mean.
>>>>>>>
>>>>>>> I was actually thinking of drm_sched_job_dependency()
>>>>>>> because
>>>>>>> that
>>>>>>> looked it would skip dependencies upon encountering an
>>>>>>> unconsumed
>>>>>>> preallocated slot, but yes, drm_sched_job_add_dependency()
>>>>>>> could
>>>>>>> explode
>>>>>>> even earlier if adding a normal dependency after
>>>>>>> preallocating
>>>>>>> a
>>>>>>> slot.
>>>>>>>
>>>>>>>> Yes, it would. All operations simply give you NULL for
>>>>>>>> those
>>>>>>>> slots. So
>>>>>>>> seems to me you have to check for NULL wherever a
>>>>>>>> preallocated
>>>>>>>> slot
>>>>>>>> might drop out. That would then be a bug.
>>>>>>>>
>>>>>>>> It's kind of tricky, all that. It's a pity that Wilcox
>>>>>>>> didn't
>>>>>>>> answer
>>>>>>>> our questions about the idiomatic way to do it.
>>>>>>>>
>>>>>>>> Maybe reserving slots with already signaled fences wasn't
>>>>>>>> such a
>>>>>>>> bad
>>>>>>>> idea after all?
>>>>>>>>
>>>>>>>> If we go for the NULL approach, it's probably the only
>>>>>>>> sane
>>>>>>>> way
>>>>>>>> to then
>>>>>>>> check for NULL wherever dependencies are accessed :(
>>>>>>>>
>>>>>>>> Opinions?
>>>>>>>
>>>>>>> Well if the xarray API returns the NULL consistently the
>>>>>>> approach
>>>>>>> from
>>>>>>> this patch is fine I think.
>>>>>>>
>>>>>>> We just need to add two more checks to the above mentioned
>>>>>>> functions,
>>>>>>
>>>>>> I need to correct myself, drm_sched_job_dependency() wouldn't
>>>>>> be
>>>>>> able
>>>>>> to
>>>>>> just skip NULLs since it relies on NULL for "no more
>>>>>> dependencies".
>>>>>> We
>>>>>> would need to track something like job->max_dependency and
>>>>>> terminate
>>>>>> on
>>>>>> job->last_dependency > job->max_dependency or so.
>>>>>
>>>>> Agreed, that would have to be fixed.
>>>>>
>>>>> I believe we should reconsider Christian's first idea [1].
>>>>>
>>>>> Thinking about it some more:
>>>>>     * With the NULL version, suddenly the xarray containing only
>>>>> valid
>>>>>       dependencies can sometimes contain NULL entries.
>>>>>     * If we could create our own tag, entries could be returned
>>>>> that
>>>>> were
>>>>>       neither NULL nor valid fences, also requiring checks
>>>>> 'everywhere'.
>>>>>     * Only the "signaled fence as prealloc reservation" approach
>>>>> is
>>>>> fully
>>>>>       backwards compatible and will never cause anyone to block
>>>>> after
>>>>>       later reworks.
>>>>>
>>>>> So maybe it's actually the best idea?
>>>>>
>>>>> Sorry for the zigg-zagg. No hard requirements intended from my
>>>>> side,
>>>>> I'm willing to go with what you guys think.
>>>>>
>>>>> Just saying, at least now I think that the already-signaled
>>>>> fence
>>>>> seems
>>>>> the most elegant solution. And since there's a function
>>>>> (dma_fence_get_stub()) for that, it seems to be in alignment
>>>>> with
>>>>> official dma_fence rules.
>>>>
>>>> Potential problem there was dma_fence_is_signaled() and fence
>>>> signaling
>>>> annotations. In case some driver is holding a lock over the
>>>> arm+push
>>>> pair. I wish we had a non-signaling is_signaled helper..
>>>>
>>>
>>> Yes! +1!
>>>
>>> But Christian doesn't like that direction:
>>>
>>> https://lore.kernel.org/all/20250409120640.106408-2-phasta@kernel.org/
>>
>> Thanks, I read this but ended up uncertain on the conclusion.
>>
>> For instance Christian at the end comments like this:
>>
>> """
>> You can test the flag if you know what the fence means to you, that
>> is
>> not a problem at all.
>> """
>>
>> That was in the context of testing the signaled bit without
>> opportunistic signaling.
>>
>> For me, from the scheduler dependencies side, that should exactly
>> apply.
>> Scheduler knows it does not need to add a signaled fence to the dep
>> array so AFAICS it is fine to skip it. And it may easily be
>> opportunistic signaling ends up a problem for the scheduler.
>>
>> So maybe such helper would be okay after all.
> 
> The thing is that, if I understand him correctly, Christian doesn't
> want a helper. He wants "us" to just use test_bit().

I suspected it may be only about not renaming into some variant of 
check_and_signal etc. With which I agree with, that the name should be 
left alone, because...

> My point is just that dma_fence_is_signaled() is a horrible name.

... as dodgy as it is, and despite me also failing to really understand 
the reason why it _has_ to be have opportunistic signaling, I think 
those semantics are by now pretty much ingrained into people dealing 
with this API so I would leave it as is.

What I was thinking was the double underscore version of the helper and 
some kerneldoc to say when it is safe to use. For me a helper is better 
than poking directly.

> The function pci_is_enabled() tells you whether the PCI device is
> enabled. What it doesn't do is
> 
> bool pci_is_enabled(pdev)
> {
>     if (crazy_callback_is_implemented()) {
>        pci_enable_device();
>        return true;
>     }
> 
>    ...
> }
> 
> It's not intuitive that a function called "{something}_is_signaled()"
> does signal that thing. Although I get that the syntactical idea
> probably is that from the GPUs POV the fence is already signaled when
> this or that seqno has been reached.
> 
> Anyways, judging aside, if a wrapper for test_bit(dma_fence) were
> acceptable, then it would need to steal dma_fence_is_signaled()'s name,
> and dma_fence_is_signaled() would have to get a new name. Which is
> precisely what was rejected, as I see it.

__dma_fence_is_signaled() is what I would do and leave the existing one 
as is.

Regards,

Tvrtko

>> Or if the concern is helper might encourage some potentially unsafe
>> usage, in that case it should come with kerneldoc describing. It is
>> not
>> like review is guaranteed to catch someone using test_bit directly
>> anyway so for me, on balance, helper is always better.
>>
>> Regards,
>>
>> Tvrtko
>>
>>
>>
>>
>>>
>>> P.
>>>
>>>>
>>>>
>>>
>>>
>>>>
>>>>
>>>> Anyway, I think both options are passable. I even like the NULL
>>>> entry
>>>> slightly more since it is simpler in a way and I don't mind some
>>>> extra
>>>> checks completely hidden in scheduler internals.
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>>>
>>>>>
>>>>> Philipp
>>>>>
>>>>>
>>>>> [1]
>>>>> https://lore.kernel.org/all/20250318120313.19099-2-christian.koenig@amd.com
>>>>> /
>>>>>
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Tvrtko
>>>>>>
>>>>>>> some more unit tests probably to make sure, and that should
>>>>>>> be
>>>>>>> fine
>>>>>>> for
>>>>>>> now.
>>>>>>>
>>>>>>> On the bikeshedding front I would perhaps suggest:
>>>>>>>
>>>>>>>      - drm_sched_job_preallocate_dependency()
>>>>>>>      - drm_sched_job_replace_dependency()
>>>>>>>
>>>>>>> Reads a little bit more aligned with the rest of the API
>>>>>>> and a
>>>>>>> bit
>>>>>>> easier on the eyes, to my eyes at least.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Tvrtko
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 



More information about the dri-devel mailing list