[Intel-gfx] [PATCH 04/41] drm/i915: Teach the i915_dependency to use a double-lock

Chris Wilson chris at chris-wilson.co.uk
Mon Jan 25 21:37:41 UTC 2021


Quoting Tvrtko Ursulin (2021-01-25 15:34:53)
> 
> On 25/01/2021 14:00, Chris Wilson wrote:
> > @@ -390,24 +410,27 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
> >   {
> >       bool ret = false;
> >   
> > -     spin_lock_irq(&schedule_lock);
> > +     /* The signal->lock is always the outer lock in this double-lock. */
> > +     spin_lock(&signal->lock);
> >   
> >       if (!node_signaled(signal)) {
> >               INIT_LIST_HEAD(&dep->dfs_link);
> >               dep->signaler = signal;
> > -             dep->waiter = node;
> > +             dep->waiter = node_get(node);
> >               dep->flags = flags;
> >   
> >               /* All set, now publish. Beware the lockless walkers. */
> > +             spin_lock_nested(&node->lock, SINGLE_DEPTH_NESTING);
> >               list_add_rcu(&dep->signal_link, &node->signalers_list);
> >               list_add_rcu(&dep->wait_link, &signal->waiters_list);
> > +             spin_unlock(&node->lock);
> >   
> >               /* Propagate the chains */
> >               node->flags |= signal->flags;
> >               ret = true;
> >       }
> >   
> > -     spin_unlock_irq(&schedule_lock);
> > +     spin_unlock(&signal->lock);
> 
> So we have to be sure another entry point cannot try to lock the same 
> nodes in reverse, that is with reversed roles. Situation where nodes are 
> simultaneously both each other waiters and signalers does indeed sound 
> impossible so I think this is fine.
> 
> Only if some entry point would lock something which is a waiter, and 
> then went to boost the priority of a signaler. That is still one with a 
> global lock. So the benefit of this patch is just to reduce contention 
> between adding and re-scheduling?

We remove the global schedule_lock in the next patch. This patch tackles
the "simpler" list management by noting that the chains can always be
taken in order of (signaler, waiter) so we have strict nesting for a
local double lock.

> And __i915_schedule does walk the list of signalers without holding this 
> new lock. What is the safety net there? RCU? Do we need 
> list_for_each_entry_rcu and explicit rcu_read_(un)lock in there then?

Yes, we are already supposedly RCU safe for the list of signalers, as
we've been depending on that for a while.

#define for_each_signaler(p__, rq__) \
        list_for_each_entry_rcu(p__, \
                                &(rq__)->sched.signalers_list, \
                                signal_link)

-Chris


More information about the Intel-gfx mailing list