[PATCH v4 02/18] drm/sched: Barriers are needed for entity->last_scheduled

Tue Jul 13 06:50:42 UTC 2021

On Tue, Jul 13, 2021 at 8:35 AM Christian König
<christian.koenig at amd.com> wrote:
>
> Am 12.07.21 um 19:53 schrieb Daniel Vetter:
> > It might be good enough on x86 with just READ_ONCE, but the write side
> > should then at least be WRITE_ONCE because x86 has total store order.
> >
> > It's definitely not enough on arm.
> >
> > Fix this properly, which means
> > - explain the need for the barrier in both places
> > - point at the other side in each comment
> >
> > Also pull out the !sched_list case as the first check, so that the
> > code flow is clearer.
> >
> > While at it sprinkle some comments around because it was very
> > non-obvious to me what's actually going on here and why.
> >
> > Note that we really need full barriers here, at first I thought
> > store-release and load-acquire on ->last_scheduled would be enough,
> > but we actually require ordering between that and the queue state.
> >
> > v2: Put smp_rmb() in the right place and fix up comment (Andrey)
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> > Cc: "Christian König" <christian.koenig at amd.com>
> > Cc: Steven Price <steven.price at arm.com>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > Cc: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> > Cc: Lee Jones <lee.jones at linaro.org>
> > Cc: Boris Brezillon <boris.brezillon at collabora.com>
> > ---
> >   drivers/gpu/drm/scheduler/sched_entity.c | 27 ++++++++++++++++++++++--
> >   1 file changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index f7347c284886..89e3f6eaf519 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> >               dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
> >
> >       dma_fence_put(entity->last_scheduled);
> > +
> >       entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
> >
> > +     /*
> > +      * If the queue is empty we allow drm_sched_entity_select_rq() to
> > +      * locklessly access ->last_scheduled. This only works if we set the
> > +      * pointer before we dequeue and if we have a write barrier here.
> > +      */
> > +     smp_wmb();
> > +
>
> Again, conceptually those barriers should be part of the spsc_queue
> container and not external to it.

That would be an extremely unusual API. Let's assume that your queue is
very dumb, and protected by a simple lock. That's about the maximum
any user could expect.

But then you still need barriers here, because Linux locks (spinlock,
mutex) are defined to be one-way barriers: Stuff that's inside is
guaranteed to be done inside, but stuff outside of the locked region
can leak in. They're load-acquire/store-release barriers. So not good
enough.

You really need to have barriers here, and they really all need to be
documented properly. And yes that's a shit-ton of work in drm/sched,
because it's full of yolo lockless stuff.
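
For illustration only, a rough userspace sketch (C11 atomics, not
kernel code) of the ordering the patch relies on. The names
entity_sketch, fence_sketch, entity_push(), entity_pop() and
entity_may_migrate() are made up for this example, the counter merely
models spsc_queue_count(), and the release/acquire fences are only an
approximation of smp_wmb()/smp_rmb(), not exact equivalents:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real drm/sched types. */
struct fence_sketch {
	bool signaled;
};

struct entity_sketch {
	atomic_int queued;	/* models spsc_queue_count() */
	_Atomic(struct fence_sketch *) last_scheduled;
};

/* Submitting thread: queue one more job. */
static void entity_push(struct entity_sketch *e)
{
	atomic_fetch_add_explicit(&e->queued, 1, memory_order_relaxed);
}

/* Scheduler thread, modelled on drm_sched_entity_pop_job(). */
static void entity_pop(struct entity_sketch *e, struct fence_sketch *finished)
{
	assert(atomic_load_explicit(&e->queued, memory_order_relaxed) > 0);

	atomic_store_explicit(&e->last_scheduled, finished,
			      memory_order_relaxed);
	/*
	 * Stand-in for smp_wmb(): the pointer store above must be visible
	 * before the dequeue below. Pairs with the fence in
	 * entity_may_migrate().
	 */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&e->queued, 1, memory_order_relaxed);
}

/* Submitting thread, modelled on drm_sched_entity_select_rq(). */
static bool entity_may_migrate(struct entity_sketch *e)
{
	struct fence_sketch *fence;

	/* Queue non-empty: stay on the same engine. */
	if (atomic_load_explicit(&e->queued, memory_order_relaxed) != 0)
		return false;

	/* Stand-in for smp_rmb(); pairs with the fence in entity_pop(). */
	atomic_thread_fence(memory_order_acquire);
	fence = atomic_load_explicit(&e->last_scheduled,
				     memory_order_relaxed);

	/* Stay put while the previous job has not finished. */
	return !(fence && !fence->signaled);
}
```

Note that both sides carry a comment naming the function that holds the
matching fence, which is the documentation rule being argued for.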

The other case you could make is that this works like a wakeup queue,
or similar. The rules there are:
- wake_up (i.e. pushing something into the queue) is a store-release barrier
- being woken up (i.e. popping an entry) is a load-acquire barrier
Which is obviously needed because otherwise you don't have coherency
for the data queued up. And again not the barriers you're looking for
here.
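
As a minimal sketch of those wakeup-queue rules, assuming a
hypothetical single-slot mailbox in userspace C11 (job_sketch, slot,
push_job() and pop_job() are invented names, not an existing kernel or
spsc_queue interface): the release store in the push publishes the
payload, and the acquire exchange in the pop makes it safe to read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload type, for illustration only. */
struct job_sketch {
	int payload;
};

/* Single-slot mailbox; NULL means empty. */
static _Atomic(struct job_sketch *) slot;

/*
 * Push side: store-release. Everything written to *j before this store
 * is visible to whoever observes the pointer via pop_job().
 */
static void push_job(struct job_sketch *j, int payload)
{
	assert(j != NULL);
	j->payload = payload;
	atomic_store_explicit(&slot, j, memory_order_release);
}

/*
 * Pop side: load-acquire (done as an exchange so the slot is emptied).
 * Pairs with the release store in push_job(). Returns NULL when empty.
 */
static struct job_sketch *pop_job(void)
{
	return atomic_exchange_explicit(&slot, (struct job_sketch *)NULL,
					memory_order_acquire);
}
```

This orders only the queued data itself. It says nothing about
unrelated state such as ->last_scheduled, which is why it does not
cover the case in the patch.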

Either way, we'd still need the comments, because it's still lockless
trickery, and every single instance of that needs to have a comment on
both sides to explain what's going on.

Essentially replace spsc_queue with an llist underneath, and that's
the level of barriers a data structure should provide. Anything else
is asking your data structure to paper over bugs in your users.

This is similar to how atomic_t is by default completely unordered,
and users need to add barriers as needed, with comments. I think this
is all to make sure people don't just write lockless algorithms
because it's a cool idea, but are forced to think this all through.
Which seems to not have happened very consistently for drm/sched, so I
guess that needs to be fixed.

I'm definitely not going to hide all that by making the spsc_queue
stuff provide random unjustified barriers just because that would
paper over drm/sched bugs. We need to fix the actual bugs, and
preferably all of them. I've found a few, but I wasn't involved in
drm/sched thus far, so the best I can do is discover them as we go.
-Daniel

> Regards,
> Christian.
>
> >       spsc_queue_pop(&entity->job_queue);
> >       return sched_job;
> >   }
> > @@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >       struct drm_gpu_scheduler *sched;
> >       struct drm_sched_rq *rq;
> >
> > -     if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)
> > +     /* single possible engine and already selected */
> > +     if (!entity->sched_list)
> > +             return;
> > +
> > +     /* queue non-empty, stay on the same engine */
> > +     if (spsc_queue_count(&entity->job_queue))
> >               return;
> >
> > -     fence = READ_ONCE(entity->last_scheduled);
> > +     /*
> > +      * Only when the queue is empty are we guaranteed that the scheduler
> > +      * thread cannot change ->last_scheduled. To enforce ordering we need
> > +      * a read barrier here. See drm_sched_entity_pop_job() for the other
> > +      * side.
> > +      */
> > +     smp_rmb();
> > +
> > +     fence = entity->last_scheduled;
> > +
> > +     /* stay on the same engine if the previous job hasn't finished */
> >       if (fence && !dma_fence_is_signaled(fence))
> >               return;
> >
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch