[PATCH 1/3] drm/sched: add drm_sched_prealloc_dependency_slots v3

Philipp Stanner phasta at mailbox.org
Fri May 16 17:16:44 UTC 2025


On Fri, 2025-05-16 at 15:30 +0100, Tvrtko Ursulin wrote:
> 
> On 16/05/2025 14:38, Philipp Stanner wrote:
> > On Fri, 2025-05-16 at 13:10 +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 16/05/2025 12:53, Tvrtko Ursulin wrote:
> > > > 
> > > > On 16/05/2025 08:28, Philipp Stanner wrote:
> > > > > On Thu, 2025-05-15 at 17:17 +0100, Tvrtko Ursulin wrote:
> > > > > > 
> > > > > > On 15/05/2025 16:00, Christian König wrote:
> > > > > > > Sometimes drivers need to be able to submit multiple jobs which
> > > > > > > depend on each other to different schedulers at the same time,
> > > > > > > but drm_sched_job_add_dependency() must not fail any more once
> > > > > > > the first job has been initialized.
> > > > > > > 
> > > > > > > This function preallocates memory for dependency slots so that
> > > > > > > no ENOMEM can come later while adding dependencies.
> > > > > > > 
> > > > > > > v2: rework implementation and documentation
> > > > > > > v3: rework from scratch, use separate function to add
> > > > > > > preallocated deps
> > > > > 
> > > > > I think we agreed to not put change logs into commit messages
> > > > > anymore :)
> > > > > 
> > > > > They aren't useful for any reader. Whoever needs the changelog
> > > > > afterwards can retrieve it through the mail thread link that we add.
> > > > > 
> > > > > > > 
> > > > > > > Signed-off-by: Christian König <christian.koenig at amd.com>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/scheduler/sched_main.c | 45 ++++++++++++++++++++++++++
> > > > > > >  include/drm/gpu_scheduler.h            |  4 +++
> > > > > > >  2 files changed, 49 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > index f7118497e47a..b95e7089aa70 100644
> > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > @@ -858,6 +858,51 @@ void drm_sched_job_arm(struct drm_sched_job *job)
> > > > > > >  }
> > > > > > >  EXPORT_SYMBOL(drm_sched_job_arm);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * drm_sched_job_prealloc_dependency_slot - avoid ENOMEM on adding dependencies
> > > > > > > + * @job: scheduler job where dependencies will be added
> > > > > > > + * @id: id for the allocated slot
> > > > > > > + *
> > > > > > > + * Sometimes drivers need to be able to submit multiple jobs which depend on
> > > > > > > + * each other to different schedulers at the same time, but
> > > > > > > + * drm_sched_job_add_dependency() must not fail any more once the first job
> > > > > > > + * has been initialized.
> > > > > > > + *
> > > > > > > + * This function preallocates memory for a dependency slot so that no ENOMEM
> > > > > > > + * can come later while adding dependencies. The index of the preallocated
> > > > > > > + * slot is returned in @id.
> > > > > > > + *
> > > > > > > + * Return:
> > > > > > > + * 0 on success, or an error on failing to expand the array.
> > > > > > > + */
> > > > > > > +int drm_sched_job_prealloc_dependency_slot(struct drm_sched_job *job,
> > > > > > > +					   u32 *id)
> > > > > > > +{
> > > > > > > +	return xa_alloc(&job->dependencies, id, NULL, xa_limit_32b, GFP_KERNEL);
> > > > > > > +}
> > > > > > > +EXPORT_SYMBOL(drm_sched_job_prealloc_dependency_slot);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * drm_sched_job_add_prealloc_dep - add dependency to preallocated slot
> > > > > > > + * @job: scheduler job where dependencies will be added
> > > > > > > + * @id: the preallocated slot index
> > > > > > > + * @fence: the dependency to add
> > > > > > > + *
> > > > > > > + * Consumes @fence and adds it to the preallocated slot dependency.
> > > > > > > + */
> > > > > > > +void drm_sched_job_add_prealloc_dep(struct drm_sched_job *job, u32 id,
> > > > > > > +				    struct dma_fence *fence)
> > > > > > > +{
> > > > > > > +	fence = xa_store(&job->dependencies, id, fence, GFP_ATOMIC);
> > > > > > 
> > > > > > Add assert that the passed id exists (was preallocated) and is
> > > > > > NULL?
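> > > > > > Something like this maybe (untested; note that xa_load() returns
> > > > > > NULL both for a reserved slot and for an index that was never
> > > > > > allocated, so it can only catch a slot being consumed twice):
> > > > > > 
> > > > > > 	/* Catch a preallocated slot being consumed twice. */
> > > > > > 	WARN_ON(xa_load(&job->dependencies, id) != NULL);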
> > > > > 
> > > > > You
> > > > 
> > > > Hm?
> > > > 
> > > > > > 
> > > > > > Also, if someone preallocates and does not consume the slot, will
> > > > > > that confuse the iteration in drm_sched_job_dependency()?
> > > > > 
> > > > > drm_sched_job_add_dependency() you mean.
> > > > 
> > > > I was actually thinking of drm_sched_job_dependency() because that
> > > > looked like it would skip dependencies upon encountering an unconsumed
> > > > preallocated slot, but yes, drm_sched_job_add_dependency() could
> > > > explode even earlier if adding a normal dependency after preallocating
> > > > a slot.
> > > > 
> > > > > Yes, it would. All operations simply give you NULL for those slots.
> > > > > So it seems to me you have to check for NULL wherever a preallocated
> > > > > slot might drop out. That would then be a bug.
> > > > > 
> > > > > It's kind of tricky, all that. It's a pity that Wilcox didn't answer
> > > > > our questions about the idiomatic way to do it.
> > > > > 
> > > > > Maybe reserving slots with already signaled fences wasn't such a bad
> > > > > idea after all?
> > > > > 
> > > > > If we go for the NULL approach, it's probably the only sane way to
> > > > > then check for NULL wherever dependencies are accessed :(
> > > > > 
> > > > > Opinions?
> > > > 
> > > > Well, if the xarray API returns NULL consistently, the approach from
> > > > this patch is fine, I think.
> > > > 
> > > > We just need to add two more checks to the above-mentioned functions,
> > > 
> > > I need to correct myself, drm_sched_job_dependency() wouldn't be able
> > > to just skip NULLs since it relies on NULL for "no more dependencies".
> > > We would need to track something like job->max_dependency and terminate
> > > on job->last_dependency > job->max_dependency or so.
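> > > Untested sketch of what I mean, assuming a hypothetical new
> > > job->max_dependency field that records the highest slot index handed
> > > out so far (the prepare_job fallback is omitted for brevity):
> > > 
> > > 	static struct dma_fence *
> > > 	drm_sched_job_dependency(struct drm_sched_job *job,
> > > 				 struct drm_sched_entity *entity)
> > > 	{
> > > 		struct dma_fence *f;
> > > 
> > > 		while (job->last_dependency <= job->max_dependency) {
> > > 			f = xa_load(&job->dependencies,
> > > 				    job->last_dependency++);
> > > 			/* NULL is an unconsumed preallocated slot, skip it */
> > > 			if (f)
> > > 				return dma_fence_get(f);
> > > 		}
> > > 
> > > 		return NULL;
> > > 	}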
> > 
> > Agreed, that would have to be fixed.
> > 
> > I believe we should reconsider Christian's first idea [1].
> > 
> > Thinking about it some more:
> >   * With the NULL version, the xarray that used to contain only valid
> >     dependencies can suddenly contain NULL entries as well.
> >   * If we could create our own tag, entries could be returned that were
> >     neither NULL nor valid fences, also requiring checks 'everywhere'.
> >   * Only the "signaled fence as prealloc reservation" approach is fully
> >     backwards compatible and will never cause anyone to block after
> >     later reworks.
> > 
> > So maybe it's actually the best idea?
> > 
> > Sorry for the zigzag. No hard requirements intended from my side, I'm
> > willing to go with what you guys think.
> > 
> > Just saying, at least now I think that the already-signaled fence seems
> > the most elegant solution. And since there's a function
> > (dma_fence_get_stub()) for that, it seems to be in alignment with the
> > official dma_fence rules.
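> > Rough, untested sketch of that variant of this patch's
> > drm_sched_job_prealloc_dependency_slot(), reserving the slot with the
> > already signaled stub fence instead of NULL:
> > 
> > 	int drm_sched_job_prealloc_dependency_slot(struct drm_sched_job *job,
> > 						   u32 *id)
> > 	{
> > 		struct dma_fence *stub = dma_fence_get_stub();
> > 		int ret;
> > 
> > 		/* Every code path walking job->dependencies keeps seeing a
> > 		 * valid (and already signaled) fence, no NULL checks needed.
> > 		 */
> > 		ret = xa_alloc(&job->dependencies, id, stub, xa_limit_32b,
> > 			       GFP_KERNEL);
> > 		if (ret)
> > 			dma_fence_put(stub);
> > 
> > 		return ret;
> > 	}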
> 
> Potential problem there was dma_fence_is_signaled() and the fence
> signaling annotations, in case some driver is holding a lock over the
> arm+push pair. I wish we had a non-signaling is_signaled helper...
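> Something like this, hypothetically (name made up), peeking only at the
> already recorded state instead of calling ops->signaled() and going
> through the signaling annotations like dma_fence_is_signaled() does:
> 
> 	static inline bool dma_fence_peek_signaled(struct dma_fence *fence)
> 	{
> 		/* No ops->signaled() call, no signaling annotations. */
> 		return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
> 	}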
> 

Yes! +1!

But Christian doesn't like that direction:

https://lore.kernel.org/all/20250409120640.106408-2-phasta@kernel.org/


P.

> Anyway, I think both options are passable. I even like the NULL entry
> slightly more since it is simpler in a way and I don't mind some extra
> checks completely hidden in scheduler internals.
> 
> Regards,
> 
> Tvrtko
> 
> > 
> > 
> > Philipp
> > 
> > 
> > [1] https://lore.kernel.org/all/20250318120313.19099-2-christian.koenig@amd.com/
> > 
> > 
> > > 
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > some more unit tests probably to make sure, and that should be fine
> > > > for now.
> > > > 
> > > > On the bikeshedding front I would perhaps suggest:
> > > > 
> > > >    - drm_sched_job_preallocate_dependency()
> > > >    - drm_sched_job_replace_dependency()
> > > > 
> > > > Reads a little bit more aligned with the rest of the API and a bit
> > > > easier on the eyes, to my eyes at least.
> > > > 
> > > > Regards,
> > > > 
> > > > Tvrtko
> > > > 
> > > 
> > 
> 


