[Bug 89001] [SKL]Time out and system reboot fails while running IGT cases: gem_ringfill/render, gem_ringfill/render-interruptible

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Mar 10 13:23:52 PDT 2015


https://bugs.freedesktop.org/show_bug.cgi?id=89001

Jesse Barnes <jbarnes at virtuousgeek.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |michel.thierry at intel.com

--- Comment #1 from Jesse Barnes <jbarnes at virtuousgeek.org> ---
Michel, have you seen this one?  It's hard to capture logs since the system
hangs pretty hard, but I did get one showing a bad I/O access in the iowrite32()
in intel_logical_ring_emit(), which sent me searching for our virtual_start
mapping setup.  That led me to something like this:

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index fcb074b..bc97457 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -504,8 +504,11 @@ static int execlists_context_queue(struct intel_engine_cs *
        unsigned long flags;
        int num_elements = 0;

-       if (to != ring->default_context)
-               intel_lr_context_pin(ring, to);
+       if (to != ring->default_context) {
+               ret = intel_lr_context_pin(ring, to);
+               if (ret)
+                       return ret;
+       }

        if (!request) {
                /*
@@ -802,13 +805,16 @@ intel_logical_ring_advance_and_submit(struct intel_ringbuf
                                      struct drm_i915_gem_request *request)
 {
        struct intel_engine_cs *ring = ringbuf->ring;
+       int ret;

        intel_logical_ring_advance(ringbuf);

        if (intel_ring_stopped(ring))
                return;

-       execlists_context_queue(ring, ctx, ringbuf->tail, request);
+       ret = execlists_context_queue(ring, ctx, ringbuf->tail, request);
+       if (ret)
+               DRM_ERROR("execlist context queue failed: %d\n", ret);
 }

 static int intel_lr_context_pin(struct intel_engine_cs *ring,

but that's not sufficient to fix this bug.  It does seem important that we
check these return values, though.
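
The return-value checking in the patch above can be illustrated outside the driver with a minimal, self-contained sketch. Every name here (`sketch_ctx`, `sketch_context_pin`, `sketch_context_queue`, `sketch_advance_and_submit`) is a hypothetical stand-in, not an i915 type or function, and the pin failure is simulated with a flag. It only shows the pattern: the queue step returns the pin error instead of continuing, and a void caller that cannot propagate the error at least logs it.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a context object; not the real i915 structure. */
struct sketch_ctx {
    int pin_should_fail; /* set to 1 to simulate a pin failure */
    int pinned;          /* set to 1 once the pin succeeded */
};

/* Hypothetical pin helper: 0 on success, negative errno on failure. */
static int sketch_context_pin(struct sketch_ctx *ctx)
{
    if (ctx->pin_should_fail)
        return -ENOMEM;
    ctx->pinned = 1;
    return 0;
}

/* Queue step: return early with the error if the pin did not succeed. */
static int sketch_context_queue(struct sketch_ctx *to,
                                struct sketch_ctx *default_ctx)
{
    int ret;

    /* As in the patch, the default context is not pinned here. */
    if (to != default_ctx) {
        ret = sketch_context_pin(to);
        if (ret)
            return ret;
    }

    /* ... the request would be added to the queue here ... */
    return 0;
}

/* void caller: it cannot return the error, so it reports it instead. */
static void sketch_advance_and_submit(struct sketch_ctx *to,
                                      struct sketch_ctx *default_ctx)
{
    int ret = sketch_context_queue(to, default_ctx);

    if (ret)
        fprintf(stderr, "context queue failed: %d\n", ret);
}
```

Calling `sketch_advance_and_submit()` with a context whose `pin_should_fail` is set prints the error to stderr and leaves `pinned` at 0, rather than silently carrying on with an unpinned context.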

This failure may also indicate something wrong with the lrc handling code; I'm
not sure.  Some additional custom kernel debug code would probably help narrow
things down.
