[PATCH] drm/sched/tests: Use one lock for fence context

Philipp Stanner phasta at mailbox.org
Thu May 22 14:06:27 UTC 2025


On Wed, 2025-05-21 at 11:24 +0100, Tvrtko Ursulin wrote:
> 
> On 21/05/2025 11:04, Philipp Stanner wrote:
> > When the unit tests were implemented, each scheduler job got its own,
> > distinct lock. This is not how dma_fence context locking rules are to
> > be implemented. All jobs belonging to the same fence context (in this
> > case: scheduler) should share a lock for their dma_fences. This is to
> > comply with various dma_fence rules, e.g., ensuring that only one
> > fence gets signaled at a time.
> > 
> > Use the fence context (scheduler) lock for the jobs.
> 
> I think for the mock scheduler it works to share the lock, but I don't
> see that the commit message is correct. Where do you see the
> requirement to share the lock? AFAIK fence->lock is a fence lock,
> nothing more semantically.

This patch is partly meant to probe with Christian and Danilo whether we
can get a bit more clarity about it.

In many places, notably Nouveau, it is definitely well-established
practice to use one lock for the fctx and all the jobs associated with
it.
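
Roughly the pattern I have in mind, as a minimal sketch with made-up
names (this is neither Nouveau's nor the mock scheduler's actual code,
just an illustration):

#include <linux/atomic.h>
#include <linux/dma-fence.h>
#include <linux/spinlock.h>

/* Hypothetical fence context -- illustration only, names made up. */
struct my_fence_ctx {
	spinlock_t	lock;		/* shared by all fences of this context */
	u64		context;	/* from dma_fence_context_alloc() */
	atomic64_t	next_seqno;
};

static void my_fence_ctx_init(struct my_fence_ctx *ctx)
{
	spin_lock_init(&ctx->lock);
	ctx->context = dma_fence_context_alloc(1);
	atomic64_set(&ctx->next_seqno, 0);
}

static void my_job_fence_init(struct my_fence_ctx *ctx, struct dma_fence *f,
			      const struct dma_fence_ops *ops)
{
	/* Every job fence shares ctx->lock instead of carrying its own. */
	dma_fence_init(f, ops, &ctx->lock, ctx->context,
		       atomic64_inc_return(&ctx->next_seqno));
}

The point being that dma_fence_init() gets &ctx->lock for every fence of
the context, so fence->lock is the same spinlock for all of them.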


> 
> And what does "ensuring that only one fence gets signalled at a time"
> mean? You mean signal in seqno order? 

Yes, and that's related. If jobs' fences can get signaled independently
from each other, that can race and screw up the ordering. A common lock
can prevent that.
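
To sketch what I mean (again with hypothetical names, and only the
fields relevant for signaling; this is not the mock scheduler's actual
completion path): if all fences of a context share ctx->lock and the
pending jobs are kept in submission order, nothing can signal a later
fence while an earlier one is still being handled:

#include <linux/dma-fence.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/* Hypothetical job, kept on a per-context pending list in submission order. */
struct my_job {
	struct dma_fence	fence;	/* initialized with the shared ctx lock */
	struct list_head	link;
};

struct my_fence_ctx {
	spinlock_t		lock;
	struct list_head	pending;
};

/* Signal all pending fences up to and including @upto_seqno, in order. */
static void my_fence_ctx_complete_up_to(struct my_fence_ctx *ctx, u64 upto_seqno)
{
	struct my_job *job, *tmp;
	unsigned long flags;

	spin_lock_irqsave(&ctx->lock, flags);
	list_for_each_entry_safe(job, tmp, &ctx->pending, link) {
		if (job->fence.seqno > upto_seqno)
			break;
		list_del(&job->link);
		/* ctx->lock is also fence->lock here, so the _locked variant applies. */
		dma_fence_signal_locked(&job->fence);
	}
	spin_unlock_irqrestore(&ctx->lock, flags);
}

Since ctx->lock is also fence->lock for each job, the _locked variant is
the one to use while holding it -- which is why the patch below switches
the mock scheduler to dma_fence_signal_locked().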

> Even that is not guaranteed in the contract due to opportunistic
> signalling.

Jobs must be pushed to the hardware in the order in which they were
submitted, and their fences must, therefore, be signaled in that order. No?

What do you mean by opportunistic signaling?


P.




> 
> Regards,
> 
> Tvrtko
> 
> > Signed-off-by: Philipp Stanner <phasta at kernel.org>
> > ---
> >   drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 5 ++---
> >   drivers/gpu/drm/scheduler/tests/sched_tests.h    | 1 -
> >   2 files changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > index f999c8859cf7..17023276f4b0 100644
> > --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > @@ -64,7 +64,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
> >   
> >   	job->flags |= DRM_MOCK_SCHED_JOB_DONE;
> >   	list_move_tail(&job->link, &sched->done_list);
> > -	dma_fence_signal(&job->hw_fence);
> > +	dma_fence_signal_locked(&job->hw_fence);
> >   	complete(&job->done);
> >   }
> >   
> > @@ -123,7 +123,6 @@ drm_mock_sched_job_new(struct kunit *test,
> >   	job->test = test;
> >   
> >   	init_completion(&job->done);
> > -	spin_lock_init(&job->lock);
> >   	INIT_LIST_HEAD(&job->link);
> >   	hrtimer_setup(&job->timer, drm_mock_sched_job_signal_timer,
> >   		      CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
> > @@ -169,7 +168,7 @@ static struct dma_fence *mock_sched_run_job(struct drm_sched_job *sched_job)
> >   
> >   	dma_fence_init(&job->hw_fence,
> >   		       &drm_mock_sched_hw_fence_ops,
> > -		       &job->lock,
> > +		       &sched->lock,
> >   		       sched->hw_timeline.context,
> >   		       atomic_inc_return(&sched->hw_timeline.next_seqno));
> >   
> > diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
> > index 27caf8285fb7..fbba38137f0c 100644
> > --- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
> > +++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
> > @@ -106,7 +106,6 @@ struct drm_mock_sched_job {
> >   	unsigned int		duration_us;
> >   	ktime_t			finish_at;
> >   
> > -	spinlock_t		lock;
> >   	struct dma_fence	hw_fence;
> >   
> >   	struct kunit		*test;
> 


