[Intel-gfx] [CI 06/20] drm/i915: Slaughter the thundering i915_wait_request herd

Mon Jun 6 10:14:37 UTC 2016

On Mon, May 23, 2016 at 09:53:40AM +0100, Tvrtko Ursulin wrote:
> 
> On 20/05/16 13:19, Chris Wilson wrote:
> >On Fri, May 20, 2016 at 01:04:13PM +0100, Tvrtko Ursulin wrote:
> >>>+	p = &b->waiters.rb_node;
> >>>+	while (*p) {
> >>>+		parent = *p;
> >>>+		if (wait->seqno == to_wait(parent)->seqno) {
> >>>+			/* We have multiple waiters on the same seqno, select
> >>>+			 * the highest priority task (that with the smallest
> >>>+			 * task->prio) to serve as the bottom-half for this
> >>>+			 * group.
> >>>+			 */
> >>>+			if (wait->task->prio > to_wait(parent)->task->prio) {
> >>>+				p = &parent->rb_right;
> >>>+				first = false;
> >>>+			} else
> >>>+				p = &parent->rb_left;
> >>>+		} else if (i915_seqno_passed(wait->seqno,
> >>>+					     to_wait(parent)->seqno)) {
> >>>+			p = &parent->rb_right;
> >>>+			if (i915_seqno_passed(seqno, to_wait(parent)->seqno))
> >>>+				completed = parent;
> >>
> >>Hm don't you need the completed handling in the equal seqno case as well?
> >
> >i915_seqno_passed() returnts true if seqnoA == seqnoB
> 
> I meant in the first branch, multiple waiter on the same seqno?

It could be applied there as well. It is just a bit hairer to keep the
rbtree and call semantics intact. Hmm, for ourselves it is probably
better to early return and late the pending wakeup handle the rbtree.

> >>>+void intel_engine_remove_wait(struct intel_engine_cs *engine,
> >>>+			      struct intel_wait *wait)
> >>>+{
> >>>+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >>>+
> >>>+	/* Quick check to see if this waiter was already decoupled from
> >>>+	 * the tree by the bottom-half to avoid contention on the spinlock
> >>>+	 * by the herd.
> >>>+	 */
> >>>+	if (RB_EMPTY_NODE(&wait->node))
> >>>+		return;
> >>>+
> >>>+	spin_lock(&b->lock);
> >>>+
> >>>+	if (b->first_waiter == wait->task) {
> >>>+		struct rb_node *next;
> >>>+		struct task_struct *task;
> >>>+		const int priority = wait->task->prio;
> >>>+
> >>>+		/* We are the current bottom-half. Find the next candidate,
> >>>+		 * the first waiter in the queue on the remaining oldest
> >>>+		 * request. As multiple seqnos may complete in the time it
> >>>+		 * takes us to wake up and find the next waiter, we have to
> >>>+		 * wake up that waiter for it to perform its own coherent
> >>>+		 * completion check.
> >>>+		 */
> >>>+		next = rb_next(&wait->node);
> >>>+		if (chain_wakeup(next, priority)) {
> >>
> >>Don't get this, next waiter my be a different seqno so how is the
> >>priority check relevant?
> >
> >We only want to call try_to_wake_up() on processes at least as important
> >as us.
> 
> But what is the comment saying about having to wake up the following
> waiter, because of multiple seqnos potentially completing? It says
> that and then it may not wake up anyone depending on relative
> priorities.

It does, it always wakes the next bottom half.

> >>Also, how can the next node be smaller priority anyway, if equal
> >>seqno if has be equal or greater I thought?
> >
> >Next seqno can have a smaller priority (and be complete). chain_wakeup()
> >just asks if the next waiter is as important as ourselves (the next
> >waiter may be for a different seqno).
> >
> >>Then since add_waiter will wake up one extra waiter to handle the
> >>missed irq case, here you may skip checking them based simply on
> >>task priority which seems wrong.
> >
> >Correct. They will be handled by the next waiter in the qeueue, not us.
> >Our job is to wake as many completed (+ the potentially completed bh) as
> >possible without incurring undue delay for ourselves. All completed waiters
> >will be woken in turn as the next bh to run will look at the list and
> >wake up those at the the same priority as itself (+ the next potentially
> >completed bh).
> 
> Who will wake up the next waiter? add_waiter will wake up one, and
> then the next waiter here (in remove_waiter) may not wake up any
> further based on priority.

We do. We wake up all waiters that are complete at the same priority or
better than us, plus the next waiter to take over as the bottom half.

> Is the assumption that it is only possible to miss one interrupt?
> 
> >>>+static inline bool intel_engine_wakeup(struct intel_engine_cs *engine)
> >>>+{
> >>>+	bool wakeup = false;
> >>>+	struct task_struct *task = READ_ONCE(engine->breadcrumbs.first_waiter);
> >>>+	/* Note that for this not to dangerously chase a dangling pointer,
> >>>+	 * the caller is responsible for ensure that the task remain valid for
> >>>+	 * wake_up_process() i.e. that the RCU grace period cannot expire.
> >>>+	 */
> >>
> >>This gets called from hard irq context and I did not manage to
> >>figure out what makes it safe in the remove waiter path? Why the
> >>hard irq couldn't not sample NULL here?
> >
> >Because we only need RCU access to task, which is provided by the hard
> >irq context.
> 
> What does the comment mean by saying callers are responsible for the
> RCU period? From which point to which? I still can't tie whatever
> callers might be doing with the unrelated hardirq.

Ensuring that the preempt count is raised to prevent the RCU grace
period from expiring between the read of task and the *task, i.e.
rcu_read_lock() or other means.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre