[Intel-gfx] [PATCH 15/32] drm/i915: Slaughter the thundering i915_wait_request herd
Chris Wilson
chris at chris-wilson.co.uk
Mon Dec 14 05:18:16 PST 2015
On Mon, Dec 14, 2015 at 12:21:32PM +0000, Tvrtko Ursulin wrote:
> >@@ -1229,18 +1219,13 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
> > s64 *timeout,
> > struct intel_rps_client *rps)
> > {
> >- struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
> >- struct drm_device *dev = ring->dev;
> >- struct drm_i915_private *dev_priv = dev->dev_private;
> >- const bool irq_test_in_progress =
> >- ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
> > int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
> >- DEFINE_WAIT(wait);
> >- unsigned long timeout_expire;
> >+ struct intel_breadcrumb wait;
> >+ unsigned long timeout_remain;
> > s64 before, now;
> >- int ret;
> >+ int ret = 0;
> >
> >- WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
> >+ might_sleep();
>
> Why this suddenly?
No reason other than that, while reading the other patches, I thought
it would make a good annotation for this function.
> > /* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
> > for_each_ring(ring, dev_priv, i)
> >- wake_up_all(&ring->irq_queue);
> >+ intel_engine_wakeup(ring);
>
> This is from process context without a lock or synchronize rcu. I'll
> comment on it at the implementation site as well.
True. We could just do rcu_read_lock() here.
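Purely as a sketch of what that would look like (assuming
intel_engine_wakeup only dereferences the waiter's task pointer inside
the read-side critical section):

	/* Wake up __i915_wait_request, potentially holding dev->struct_mutex.
	 * The RCU read lock keeps the waiter's task_struct alive between
	 * sampling first_waiter and calling wake_up_process() on it.
	 */
	rcu_read_lock();
	for_each_ring(ring, dev_priv, i)
		intel_engine_wakeup(ring);
	rcu_read_unlock();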
> >
> > /* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
> > wake_up_all(&dev_priv->pending_flip_queue);
> >@@ -2986,16 +2985,17 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
> > if (ring_idle(ring, seqno)) {
> > ring->hangcheck.action = HANGCHECK_IDLE;
> >
> >- if (waitqueue_active(&ring->irq_queue)) {
> >+ if (intel_engine_has_waiter(ring)) {
> > /* Issue a wake-up to catch stuck h/w. */
> > if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
> >- if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
> >+ if (!test_bit(ring->id, &dev_priv->gpu_error.test_irq_rings))
> > DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
> > ring->name);
> > else
> > DRM_INFO("Fake missed irq on %s\n",
> > ring->name);
> >- wake_up_all(&ring->irq_queue);
> >+
> >+ intel_engine_enable_fake_irq(ring);
> > }
> > /* Safeguard against driver failure */
> > ring->hangcheck.score += BUSY;
> >diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >new file mode 100644
> >index 000000000000..a9a199bca2c2
> >--- /dev/null
> >+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >@@ -0,0 +1,274 @@
> >+/*
> >+ * Copyright © 2015 Intel Corporation
> >+ *
> >+ * Permission is hereby granted, free of charge, to any person obtaining a
> >+ * copy of this software and associated documentation files (the "Software"),
> >+ * to deal in the Software without restriction, including without limitation
> >+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> >+ * and/or sell copies of the Software, and to permit persons to whom the
> >+ * Software is furnished to do so, subject to the following conditions:
> >+ *
> >+ * The above copyright notice and this permission notice (including the next
> >+ * paragraph) shall be included in all copies or substantial portions of the
> >+ * Software.
> >+ *
> >+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> >+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> >+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> >+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> >+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> >+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> >+ * IN THE SOFTWARE.
> >+ *
> >+ */
> >+
> >+#include "i915_drv.h"
> >+
> >+static void intel_breadcrumbs_fake_irq(unsigned long data)
> >+{
> >+ struct intel_breadcrumbs *b = (struct intel_breadcrumbs *)data;
> >+ struct task_struct *task;
> >+
> >+ /*
> >+ * The timer persists in case we cannot enable interrupts,
> >+ * or if we have previously seen seqno/interrupt incoherency
> >+ * ("missed interrupt" syndrome). Here the worker will wake up
> >+ * every jiffie in order to kick the oldest waiter to do the
> >+ * coherent seqno check.
> >+ */
> >+
> >+ task = READ_ONCE(b->first_waiter);
> >+ if (task) {
> >+ wake_up_process(task);
>
> Put a comment here describing why a task cannot exit and become
> invalid between sampling and wakeup?
>
> Or we could afford a spinlock here since this fires really infrequently?
This is softirq context, which means we would have to bump the weight
of all our locks. I didn't want to do that, so...
> Or even intel_engine_wakeup?
Only because I was using intel_breadcrumbs here. I was thinking of
if (intel_engine_wakeup(engine))
mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
Ok, that looks better. I'll have to check up on softirq context vs rcu,
I think it is safe as the RCU grace period cannot expire, but I will
have to double check.
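i.e. the timer callback would become something like this (a sketch only;
it presumes intel_engine_wakeup is changed to return whether a waiter
was woken, as suggested above):

	static void intel_breadcrumbs_fake_irq(unsigned long data)
	{
		struct intel_engine_cs *engine = (struct intel_engine_cs *)data;

		/* Softirq context: rely on intel_engine_wakeup() to protect
		 * the waiter's task_struct (e.g. under the RCU read lock).
		 * Rearm only while somebody is still waiting.
		 */
		if (intel_engine_wakeup(engine))
			mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
	}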
> >+static void irq_enable(struct intel_engine_cs *engine)
> >+{
> >+ WARN_ON(!engine->irq_get(engine));
> >+}
> >+
> >+static void irq_disable(struct intel_engine_cs *engine)
> >+{
> >+ engine->irq_put(engine);
> >+}
>
> These helpers don't do much and only have one call site each?
Later patches.
> >+static void __intel_breadcrumbs_disable_irq(struct intel_breadcrumbs *b)
> >+{
> >+ struct intel_engine_cs *engine =
> >+ container_of(b, struct intel_engine_cs, breadcrumbs);
> >+
> >+ if (!b->rpm_wakelock)
> >+ return;
> >+
> >+ if (b->irq_enabled) {
> >+ irq_disable(engine);
> >+ b->irq_enabled = false;
> >+ }
> >+
> >+ intel_runtime_pm_put(engine->i915);
> >+ b->rpm_wakelock = false;
> >+}
>
> Maybe put assert_spin_locked in the above two.
Ok.
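e.g. at the top of each (sketch):

	static void __intel_breadcrumbs_disable_irq(struct intel_breadcrumbs *b)
	{
		struct intel_engine_cs *engine =
			container_of(b, struct intel_engine_cs, breadcrumbs);

		/* irq_enabled/rpm_wakelock are serialised by b->lock */
		assert_spin_locked(&b->lock);

		if (!b->rpm_wakelock)
			return;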
> >+inline struct intel_breadcrumb *to_crumb(struct rb_node *node)
> >+{
> >+ return container_of(node, struct intel_breadcrumb, node);
> >+}
> >+
> >+bool intel_engine_add_breadcrumb(struct intel_engine_cs *engine,
> >+ struct intel_breadcrumb *wait)
> >+{
> >+ struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >+ u32 seqno = engine->get_seqno(engine, true);
> >+ struct rb_node **p, *parent, *completed;
> >+ bool first;
> >+
> >+ spin_lock(&b->lock);
> >+
> >+ * Insert the request into the retirement ordered list
> >+ * of waiters by walking the rbtree. If we are the oldest
> >+ * seqno in the tree (the first to be retired), then
> >+ * set ourselves as the bottom-half.
> >+ *
> >+ * As we descend the tree, prune completed branches since we hold the
> >+ * spinlock we know that the first_waiter must be delayed and can
> >+ * reduce some of the sequential wake up latency if we take action
> >+ * ourselves and wake up the completed tasks in parallel.
> >+ */
>
> Why it is interesting to do it both from add and remove breadcrumb
> paths? Wouldn't it be sufficient to do it only on remove?
Observation of heavily contended scenarios shows processes getting in
between the interrupt and the bottom-half. So by seeing whether we can
prune the tree, we may be able to advance the bottom-half more quickly.
> >+ if (next && next != &wait->node) {
> >+ smp_store_mb(b->first_waiter, to_crumb(next)->task);
> >+ __intel_breadcrumbs_enable_irq(b);
> >+ wake_up_process(to_crumb(next)->task);
>
> I don't get this, why it is waking up the one after completed? Just
> to anticipate it might be completed soon?
No. Because we may miss the interrupt in the process of enabling it.
> >+ }
> >+
> >+ do {
> >+ struct intel_breadcrumb *crumb = to_crumb(completed);
> >+ completed = rb_prev(completed);
> >+
> >+ rb_erase(&crumb->node, &b->requests);
> >+ RB_CLEAR_NODE(&crumb->node);
> >+ wake_up_process(crumb->task);
> >+ } while (completed != NULL);
> >+ }
> >+
> >+ if (first)
> >+ smp_store_mb(b->first_waiter, wait->task);
> >+ BUG_ON(b->first_waiter == NULL);
>
> I object your honour! :) Let the user ssh in and reboot cleanly even
> if graphics stack is stuck.
You should still be able to ssh and kill the box. I object to using
WARN_ON inappropriately.
> >+static inline void intel_engine_wakeup(struct intel_engine_cs *engine)
> >+{
> >+ struct task_struct *task = READ_ONCE(engine->breadcrumbs.first_waiter);
> >+ if (task)
> >+ wake_up_process(task);
>
> And here definitely put a comment saying why this is safe without
> the spinlock.
>
> Actually seeing how it is called from irq context and process
> context I think it will need a lock.
Ok, we can look at improving the commentary.
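Something along these lines, assuming we settle on RCU rather than
taking the spinlock here (sketch):

	static inline void intel_engine_wakeup(struct intel_engine_cs *engine)
	{
		struct task_struct *task;

		/* The RCU read lock keeps the sampled task_struct from being
		 * freed between the READ_ONCE() and the wake_up_process():
		 * the waiter cannot complete its exit until after the grace
		 * period following its removal as first_waiter.
		 */
		rcu_read_lock();
		task = READ_ONCE(engine->breadcrumbs.first_waiter);
		if (task)
			wake_up_process(task);
		rcu_read_unlock();
	}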
-Chris
--
Chris Wilson, Intel Open Source Technology Centre