[Bug 106560] [CI] igt at drv_selftest@live_hangcheck - igt_reset_engines failed with error

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Fri May 25 12:43:58 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=106560

Chris Wilson <chris at chris-wilson.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from Chris Wilson <chris at chris-wilson.co.uk> ---
Hopefully,

commit 9a4dc80399b1630cea0f1ad8ef0417436cbb95d0
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Fri May 18 11:09:33 2018 +0100

    drm/i915: Flush the ring stop bit after clearing RING_HEAD in reset

    Inside the live_hangcheck (reset) selftests, we occasionally see
    failures like

    <7>[  239.094840] i915_gem_set_wedged rcs0
    <7>[  239.094843] i915_gem_set_wedged   current seqno 19a98, last 19a9a,
hangcheck 0 [5158 ms]
    <7>[  239.094846] i915_gem_set_wedged   Reset count: 6239 (global 1)
    <7>[  239.094848] i915_gem_set_wedged   Requests:
    <7>[  239.095052] i915_gem_set_wedged           first  19a99 [e8c:5f]
prio=1024 @ 5159ms: (null)
    <7>[  239.095056] i915_gem_set_wedged           last   19a9a [e81:1a]
prio=139 @ 5159ms: igt/rcs0[5977]/1
    <7>[  239.095059] i915_gem_set_wedged           active 19a99 [e8c:5f]
prio=1024 @ 5159ms: (null)
    <7>[  239.095062] i915_gem_set_wedged           [head 0220, postfix 0280,
tail 02a8, batch 0xffffffff_ffffffff]
    <7>[  239.100050] i915_gem_set_wedged           ring->start:  0x00283000
    <7>[  239.100053] i915_gem_set_wedged           ring->head:   0x000001f8
    <7>[  239.100055] i915_gem_set_wedged           ring->tail:   0x000002a8
    <7>[  239.100057] i915_gem_set_wedged           ring->emit:   0x000002a8
    <7>[  239.100059] i915_gem_set_wedged           ring->space:  0x00000f10
    <7>[  239.100085] i915_gem_set_wedged   RING_START: 0x00283000
    <7>[  239.100088] i915_gem_set_wedged   RING_HEAD:  0x00000260
    <7>[  239.100091] i915_gem_set_wedged   RING_TAIL:  0x000002a8
    <7>[  239.100094] i915_gem_set_wedged   RING_CTL:   0x00000001
    <7>[  239.100097] i915_gem_set_wedged   RING_MODE:  0x00000300 [idle]
    <7>[  239.100100] i915_gem_set_wedged   RING_IMR: fffffefe
    <7>[  239.100104] i915_gem_set_wedged   ACTHD:  0x00000000_0000609c
    <7>[  239.100108] i915_gem_set_wedged   BBADDR: 0x00000000_0000609d
    <7>[  239.100111] i915_gem_set_wedged   DMA_FADDR: 0x00000000_00283260
    <7>[  239.100114] i915_gem_set_wedged   IPEIR: 0x00000000
    <7>[  239.100117] i915_gem_set_wedged   IPEHR: 0x02800000
    <7>[  239.100120] i915_gem_set_wedged   Execlist status: 0x00044052
00000002
    <7>[  239.100124] i915_gem_set_wedged   Execlist CSB read 5 [5 cached],
write 5 [5 from hws], interrupt posted? no, tasklet queued? no (enabled)
    <7>[  239.100128] i915_gem_set_wedged           ELSP[0] count=1,
ring->start=00283000, rq: 19a99 [e8c:5f] prio=1024 @ 5164ms: (null)
    <7>[  239.100132] i915_gem_set_wedged           ELSP[1] count=1,
ring->start=00257000, rq: 19a9a [e81:1a] prio=139 @ 5164ms: igt/rcs0[5977]/1
    <7>[  239.100135] i915_gem_set_wedged           HW active? 0x5
    <7>[  239.100250] i915_gem_set_wedged           E 19a99 [e8c:5f] prio=1024
@ 5164ms: (null)
    <7>[  239.100338] i915_gem_set_wedged           E 19a9a [e81:1a] prio=139 @
5164ms: igt/rcs0[5977]/1
    <7>[  239.100340] i915_gem_set_wedged           Queue priority: 139
    <7>[  239.100343] i915_gem_set_wedged           Q 0 [e98:19] prio=132 @
5164ms: igt/rcs0[5977]/8
    <7>[  239.100346] i915_gem_set_wedged           Q 0 [e84:19] prio=121 @
5165ms: igt/rcs0[5977]/2
    <7>[  239.100349] i915_gem_set_wedged           Q 0 [e87:19] prio=82 @
5165ms: igt/rcs0[5977]/3
    <7>[  239.100352] i915_gem_set_wedged           Q 0 [e84:1a] prio=44 @
5164ms: igt/rcs0[5977]/2
    <7>[  239.100356] i915_gem_set_wedged           Q 0 [e8b:19] prio=20 @
5165ms: igt/rcs0[5977]/4
    <7>[  239.100362] i915_gem_set_wedged   drv_selftest [5894] waiting for
19a99

    where the GPU saw an arbitration point and idles; AND HAS NOT BEEN RESET!
    The RING_MODE indicates that is idle and has the STOP_RING bit set, so
    try clearing it.

    v2: Only clear the bit on restarting the ring, as we want to be sure the
    STOP_RING bit is kept if reset fails on wedging.
    v3: Spot when the ring state doesn't make sense when re-initialising the
    engine and dump it to the logs so that we don't have to wait for an
    error later and try to guess what happened earlier.
    v4: Prepare to print all the unexpected state, not just the first.

    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
    Link:
https://patchwork.freedesktop.org/patch/msgid/20180518100933.2239-1-chris@chris-wilson.co.uk

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx-bugs/attachments/20180525/b58d5d71/attachment.html>


More information about the intel-gfx-bugs mailing list