[Intel-gfx] [PATCH 17/21] drm/i915: Convert trace-irq to the breadcrumb waiter

Wed Jun 8 09:48:09 UTC 2016

On Tue, Jun 07, 2016 at 01:04:22PM +0100, Tvrtko Ursulin wrote:
> >+static int intel_breadcrumbs_signaler(void *arg)
> >+{
> >+	struct intel_engine_cs *engine = arg;
> >+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >+	struct signal *signal;
> >+
> >+	/* Install ourselves with high priority to reduce signalling latency */
> >+	signaler_set_rtpriority();
> >+
> >+	do {
> >+		set_current_state(TASK_INTERRUPTIBLE);
> >+
> >+		/* We are either woken up by the interrupt bottom-half,
> >+		 * or by a client adding a new signaller. In both cases,
> >+		 * the GPU seqno may have advanced beyond our oldest signal.
> >+		 * If it has, propagate the signal, remove the waiter and
> >+		 * check again with the next oldest signal. Otherwise we
> >+		 * need to wait for a new interrupt from the GPU or for
> >+		 * a new client.
> >+		 */
> >+		signal = READ_ONCE(b->first_signal);
> >+		if (signal_complete(signal)) {
> >+			/* Wake up all other completed waiters and select the
> >+			 * next bottom-half for the next user interrupt.
> >+			 */
> >+			intel_engine_remove_wait(engine, &signal->wait);
> >+
> >+			i915_gem_request_unreference(signal->request);
> >+
> >+			/* Find the next oldest signal. Note that as we have
> >+			 * not been holding the lock, another client may
> >+			 * have installed an even older signal than the one
> >+			 * we just completed - so double check we are still
> >+			 * the oldest before picking the next one.
> >+			 */
> >+			spin_lock(&b->lock);
> >+			if (signal == b->first_signal)
> >+				b->first_signal = rb_next(&signal->node);
> >+			rb_erase(&signal->node, &b->signals);
> >+			spin_unlock(&b->lock);
> >+
> >+			kfree(signal);
> >+		} else {
> >+			if (kthread_should_stop())
> >+				break;
> >+
> >+			schedule();
> >+		}
> >+	} while (1);
> >+
> >+	return 0;
> >+}
> 
> So the thread is only because it is convenient to plug it in the
> breadcrumbs infrastructure. Otherwise the processing above could be
> done from a lighter weight context as well since nothing seems to
> need the process context.

No, seqno processing requires process/sleepable context. The delays we
incur can be >100us and not suitable for irq/softirq context.

> One alternative could perhaps be to add a waiter->wake_up vfunc and
> signalers could then potentially use a tasklet?

Hmm, I did find that in order to reduce execlists latency, I had to
drive the tasklet processing from the signaler.

> >+int intel_engine_enable_signaling(struct drm_i915_gem_request *request)
> >+{
> >+	struct intel_engine_cs *engine = request->engine;
> >+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >+	struct rb_node *parent, **p;
> >+	struct signal *signal;
> >+	bool first, wakeup;
> >+
> >+	if (unlikely(IS_ERR(b->signaler)))
> >+		return PTR_ERR(b->signaler);
> 
> I don't see that there is a fallback is kthread creation failed. It
> should just fail in intel_engine_init_breadcrumbs if that happens.

Because it is not fatal to using the GPU, just one optional function.

> >+	signal = kmalloc(sizeof(*signal), GFP_ATOMIC);
> 
> Ugh GFP_ATOMIC - why?

Because of dma-buf/fence.c.

> And even should you instead just embed in into requests?

I was resisting embedding even more into requests, so first patch was
for a simpler integration, with a subsequent patch to embed the node
into the request.

> >@@ -329,12 +480,24 @@ void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine)
> >  	setup_timer(&b->fake_irq,
> >  		    intel_breadcrumbs_fake_irq,
> >  		    (unsigned long)engine);
> >+
> >+	/* Spawn a thread to provide a common bottom-half for all signals.
> >+	 * As this is an asynchronous interface we cannot steal the current
> >+	 * task for handling the bottom-half to the user interrupt, therefore
> >+	 * we create a thread to do the coherent seqno dance after the
> >+	 * interrupt and then signal the waitqueue (via the dma-buf/fence).
> >+	 */
> >+	b->signaler = kthread_run(intel_breadcrumbs_signaler,
> >+				  engine, "irq/i915:%d", engine->id);
> 
> As commented above, init should fail here because it cannot run
> without the thread.

We can function without the signaler.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre