[Intel-gfx] [PATCH 17/21] drm/i915: Convert trace-irq to the breadcrumb waiter
Chris Wilson
chris at chris-wilson.co.uk
Wed Jun 8 09:48:09 UTC 2016
On Tue, Jun 07, 2016 at 01:04:22PM +0100, Tvrtko Ursulin wrote:
> >+static int intel_breadcrumbs_signaler(void *arg)
> >+{
> >+ struct intel_engine_cs *engine = arg;
> >+ struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >+ struct signal *signal;
> >+
> >+ /* Install ourselves with high priority to reduce signalling latency */
> >+ signaler_set_rtpriority();
> >+
> >+ do {
> >+ set_current_state(TASK_INTERRUPTIBLE);
> >+
> >+ /* We are either woken up by the interrupt bottom-half,
> >+ * or by a client adding a new signaller. In both cases,
> >+ * the GPU seqno may have advanced beyond our oldest signal.
> >+ * If it has, propagate the signal, remove the waiter and
> >+ * check again with the next oldest signal. Otherwise we
> >+ * need to wait for a new interrupt from the GPU or for
> >+ * a new client.
> >+ */
> >+ signal = READ_ONCE(b->first_signal);
> >+ if (signal_complete(signal)) {
> >+ /* Wake up all other completed waiters and select the
> >+ * next bottom-half for the next user interrupt.
> >+ */
> >+ intel_engine_remove_wait(engine, &signal->wait);
> >+
> >+ i915_gem_request_unreference(signal->request);
> >+
> >+ /* Find the next oldest signal. Note that as we have
> >+ * not been holding the lock, another client may
> >+ * have installed an even older signal than the one
> >+ * we just completed - so double check we are still
> >+ * the oldest before picking the next one.
> >+ */
> >+ spin_lock(&b->lock);
> >+ if (signal == b->first_signal)
> >+ b->first_signal = rb_next(&signal->node);
> >+ rb_erase(&signal->node, &b->signals);
> >+ spin_unlock(&b->lock);
> >+
> >+ kfree(signal);
> >+ } else {
> >+ if (kthread_should_stop())
> >+ break;
> >+
> >+ schedule();
> >+ }
> >+ } while (1);
> >+
> >+ return 0;
> >+}
>
> So the thread is only because it is convenient to plug it in the
> breadcrumbs infrastructure. Otherwise the processing above could be
> done from a lighter weight context as well since nothing seems to
> need the process context.
No, seqno processing requires process/sleepable context. The delays we
incur can be >100us and not suitable for irq/softirq context.
> One alternative could perhaps be to add a waiter->wake_up vfunc and
> signalers could then potentially use a tasklet?
Hmm, I did find that in order to reduce execlists latency, I had to
drive the tasklet processing from the signaler.
> >+int intel_engine_enable_signaling(struct drm_i915_gem_request *request)
> >+{
> >+ struct intel_engine_cs *engine = request->engine;
> >+ struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >+ struct rb_node *parent, **p;
> >+ struct signal *signal;
> >+ bool first, wakeup;
> >+
> >+ if (unlikely(IS_ERR(b->signaler)))
> >+ return PTR_ERR(b->signaler);
>
> I don't see that there is a fallback is kthread creation failed. It
> should just fail in intel_engine_init_breadcrumbs if that happens.
Because it is not fatal to using the GPU, just one optional function.
> >+ signal = kmalloc(sizeof(*signal), GFP_ATOMIC);
>
> Ugh GFP_ATOMIC - why?
Because of dma-buf/fence.c.
> And even should you instead just embed in into requests?
I was resisting embedding even more into requests, so first patch was
for a simpler integration, with a subsequent patch to embed the node
into the request.
> >@@ -329,12 +480,24 @@ void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine)
> > setup_timer(&b->fake_irq,
> > intel_breadcrumbs_fake_irq,
> > (unsigned long)engine);
> >+
> >+ /* Spawn a thread to provide a common bottom-half for all signals.
> >+ * As this is an asynchronous interface we cannot steal the current
> >+ * task for handling the bottom-half to the user interrupt, therefore
> >+ * we create a thread to do the coherent seqno dance after the
> >+ * interrupt and then signal the waitqueue (via the dma-buf/fence).
> >+ */
> >+ b->signaler = kthread_run(intel_breadcrumbs_signaler,
> >+ engine, "irq/i915:%d", engine->id);
>
> As commented above, init should fail here because it cannot run
> without the thread.
We can function without the signaler.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list