[Intel-gfx] [PATCH] RFC drm/i915: Slaughter the thundering i915_wait_request herd

Mon Nov 2 07:28:22 PST 2015

On Mon, Nov 02, 2015 at 02:57:47PM +0000, Gong, Zhipeng wrote:
> > -----Original Message-----
> > From: Chris Wilson [mailto:chris at chris-wilson.co.uk]
> > Sent: Monday, November 02, 2015 10:16 PM
> > To: Gong, Zhipeng
> > Cc: intel-gfx at lists.freedesktop.org; Rogozhkin, Dmitry V
> > Subject: Re: [PATCH] RFC drm/i915: Slaughter the thundering
> > i915_wait_request herd
> > 
> > On Mon, Nov 02, 2015 at 02:00:47PM +0000, Gong, Zhipeng wrote:
> > > Attach the perf data for BDW async1 and async5 with or without patch.
> > 
> > Hmm, I can see it is the i915_spin_request() consuming the time, but I was
> > hoping to get the callgraph so I could see where the call to i915_wait_request
> > was originating from. Could you regenerate the perf report with -g (or maybe
> > you need a slightly newer perf)?
> > -Chris
> > 
> Chris, I dump the it to txt, but it is hard to read since the call chain is too long.
> If you need the data in other format, please let me know.

Just get a wider monitor :)

Ok, that does confirm that it is the frequent inter-ring sync keeping
the waitqueue short.

If you try,

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 21017daecb90..008b43be791f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -70,6 +70,8 @@ static void intel_engine_irq_wakeup(struct work_struct *work)
                container_of(work, struct intel_engine_cs, irq_work);
        const bool noirq = !__irq_enable(engine);
        DEFINE_WAIT(wait);
+#define KEEPALIVE 10
+       int keepalive = KEEPALIVE;
 
        for (;;) {
                struct timer_list timer;
@@ -115,8 +117,11 @@ static void intel_engine_irq_wakeup(struct work_struct *work)
                }
                engine->irq_first = request;
                spin_unlock(&engine->irq_lock);
-               if (request == NULL)
-                       break;
+               if (request == NULL) {
+                       if (--keepalive == 0)
+                               break;
+               } else
+                       keepalive = KEEPALIVE;
 
                /* Make sure the hangcheck timer is armed in case the GPU
                 * hangs and the request never completes.
@@ -124,7 +129,7 @@ static void intel_engine_irq_wakeup(struct work_struct *work)
                i915_queue_hangcheck(engine->i915);
 
                timer.function = NULL;
-               if (noirq || missed_irq(engine)) {
+               if (request == NULL || noirq || missed_irq(engine)) {
                        setup_timer_on_stack(&timer,
                                             (void (*)(unsigned long))wake_up_process,
                                             (unsigned long)current);
@@ -165,6 +170,8 @@ void intel_engine_add_wakeup(struct drm_i915_gem_request *request)
 
                if (RB_EMPTY_ROOT(&engine->irq_requests))
                        schedule_work(&engine->irq_work);
+               else
+                       wake_up_all(&engine->irq_queue);
 
                init_waitqueue_head(&request->wait);


That should keep the worker alive for a further 10 jiffies, hopefully
long enough for the next wait to occur. The cost is that it keeps the
interrupt asserted (and to avoid that requires a little rearrangment and
probably a dedicated kthread for each ring).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre