[Bug 111937] [CI][BAT] igt at i915_selftest@live_execlists - incomplete - GEM_BUG_ON(i915_active_is_idle(&ce->active))
bugzilla-daemon at freedesktop.org
Wed Oct 9 14:32:04 UTC 2019
https://bugs.freedesktop.org/show_bug.cgi?id=111937
--- Comment #2 from Chris Wilson <chris at chris-wilson.co.uk> ---
This is bizarre; it looks quite straightforward but then unravels quickly as
you start pulling on threads.
<7> [696.700847] __intel_gt_set_wedged vcs0
<7> [696.700851] __intel_gt_set_wedged Awake? 1
<7> [696.700854] __intel_gt_set_wedged Hangcheck: 5864 ms ago
<7> [696.700856] __intel_gt_set_wedged Reset count: 0 (global 0)
<7> [696.700859] __intel_gt_set_wedged Requests:
<7> [696.702510] __intel_gt_set_wedged MMIO base: 0x001c0000
<7> [696.703362] __intel_gt_set_wedged RING_START: 0x0000a000
<7> [696.704157] __intel_gt_set_wedged RING_HEAD: 0x00002038
<7> [696.704184] __intel_gt_set_wedged RING_TAIL: 0x00002038
<7> [696.704223] __intel_gt_set_wedged RING_CTL: 0x00003401 [waiting]
<7> [696.705925] __intel_gt_set_wedged RING_MODE: 0x00000200 [idle]
<7> [696.706787] __intel_gt_set_wedged RING_IMR: 00000000
<7> [696.709285] __intel_gt_set_wedged ACTHD: 0x00000000_00002038
<7> [696.711045] __intel_gt_set_wedged BBADDR: 0x00000000_00000000
<7> [696.711915] __intel_gt_set_wedged DMA_FADDR: 0x00000000_0000c038
<7> [696.712812] __intel_gt_set_wedged IPEIR: 0x00000000
<7> [696.713579] __intel_gt_set_wedged IPEHR: 0x0e40c002
<7> [696.714443] __intel_gt_set_wedged Execlist status: 0x00002098 20000040,
entries 12
<7> [696.714446] __intel_gt_set_wedged Execlist CSB read 6, write 7, tasklet
queued? no (enabled)
<7> [696.714449] __intel_gt_set_wedged Execlist CSB[7]: 0x00000002, context:
536870944
<7> [696.714472] __intel_gt_set_wedged Active[0]:
ring:{start:00006000, hwsp:ffff9140, seqno:00000001}, rq: 1b146:2* prio=3 @
8240ms: [i915]
<7> [696.714487] __intel_gt_set_wedged Pending[0]
ring:{start:0000a000, hwsp:ffff9180, seqno:00000002}, rq: 1b147:2!+ prio=4097
@ 8240ms: signaled
<7> [696.714492] __intel_gt_set_wedged Pending[1]
ring:{start:00006000, hwsp:ffff9140, seqno:00000001}, rq: 1b146:4- prio=3 @
8240ms: [i915]
<7> [696.714509] __intel_gt_set_wedged E 1b146:2* prio=3 @ 8240ms:
[i915]
<7> [696.714512] __intel_gt_set_wedged E 1b146:4- prio=3 @ 8240ms:
[i915]
<7> [696.714515] __intel_gt_set_wedged Queue priority hint: 3
<0> [696.673408] i915_sel-5787 5.... 740573711us : __intel_context_do_pin:
vcs0 context:1b146 pin ring:{head:0000, tail:0000}
<0> [696.673408] i915_sel-5787 5.... 740574064us : __intel_context_do_pin:
vcs0 context:1b147 pin ring:{head:0000, tail:0000}
<0> [696.673408] i915_sel-5787 5.... 740574078us : __engine_unpark: vcs0
<0> [696.673408] i915_sel-5787 5.... 740574084us : __gt_unpark:
<0> [696.673408] i915_sel-5787 5.... 740574655us : __i915_request_commit:
vcs0 fence 1b146:2
<0> [696.673408] i915_sel-5787 5d..1 740574662us : __i915_request_submit:
vcs0 fence 1b146:2, current 0
<0> [696.673408] i915_sel-5787 5d..1 740574663us :
__execlists_submission_tasklet: vcs0: queue_priority_hint:-2147483648,
submit:yes
<0> [696.673408] i915_sel-5787 5d..1 740574665us : trace_ports: vcs0: submit
{ 1b146:2, 0:0 }
<0> [696.673408] i915_sel-5787 5.... 740574723us : __i915_request_commit:
vcs0 fence 1b147:2
<0> [696.673408] i915_sel-5787 5.... 740574754us : __i915_request_commit:
vcs0 fence 1b146:4
<0> [696.673408] <idle>-0 2..s1 740574757us : process_csb: vcs0 cs-irq
head=5, tail=6
<0> [696.673408] <idle>-0 2..s1 740574758us : process_csb: vcs0 csb[6]:
status=0x00000001:0x20000000
<0> [696.673408] <idle>-0 2..s1 740574760us : trace_ports: vcs0:
promote { 1b146:2*, 0:0 }
<0> [696.673408] <idle>-0 2d.s2 740574784us :
__execlists_submission_tasklet: vcs0: preempting last=1b146:2, prio=3,
hint=4097
<0> [696.673408] <idle>-0 2d.s2 740574786us : __i915_request_unsubmit:
vcs0 fence 1b146:2, current 1
<0> [696.673408] <idle>-0 2d.s2 740574788us : __i915_request_submit:
vcs0 fence 1b147:2, current 0
<0> [696.673408] <idle>-0 2d.s2 740574798us : __i915_request_submit:
vcs0 fence 1b146:2, current 1
<0> [696.673408] <idle>-0 2d.s2 740574800us : __i915_request_submit:
vcs0 fence 1b146:4, current 1
<0> [696.673408] <idle>-0 2d.s2 740574801us :
__execlists_submission_tasklet: vcs0: queue_priority_hint:-2147483648,
submit:yes
<0> [696.673408] <idle>-0 2d.s2 740574802us : trace_ports: vcs0: submit
{ 1b147:2, 1b146:4 }
<0> [696.673408] i915_sel-5787 5.... 740574910us : i915_request_retire: vcs0
fence 1b147:2, current 2
<0> [696.673408] i915_sel-5787 5.... 740574912us : intel_context_unpin: vcs0
context:1b147 retire
<0> [696.673408] i915_sel-5787 5.... 740574916us : __intel_context_retire:
vcs0 context:1b147 retire
So the HW froze with a CS event still sitting in the queue (CSB read 6,
write 7, tasklet not queued), yet we never saw the interrupt. (Did the HW die,
or did we just miss an interrupt? The latter is nice and scary.)
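
To make the "event in the queue" reading concrete, here is a minimal sketch of
the ring arithmetic behind the "CSB read 6, write 7 ... entries 12" lines in
the dump. The helper names csb_pending() and csb_next() are hypothetical, not
i915 functions. The sketch assumes, as the process_csb trace above suggests
(head=5, tail=6, then csb[6] is handled), that "read" is the last slot
processed and "write" is the last slot the HW wrote.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of i915: number of CSB slots the HW has
 * written that have not been processed yet.
 * Assumption: 'read' is the last slot processed and 'write' is the last
 * slot written, both in the range [0, entries).
 */
static unsigned int csb_pending(unsigned int read, unsigned int write,
				unsigned int entries)
{
	assert(entries > 0 && read < entries && write < entries);
	/* Add 'entries' before subtracting so the wrapped case stays positive. */
	return (write + entries - read) % entries;
}

/* Hypothetical helper: the next slot to process, wrapping at 'entries'. */
static unsigned int csb_next(unsigned int read, unsigned int entries)
{
	assert(entries > 0 && read < entries);
	return (read + 1) % entries;
}
```

With the values from the dump, csb_pending(6, 7, 12) gives 1 and
csb_next(6, 12) gives 7, which matches the single unprocessed CSB[7] entry
printed above.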
During reset, the context idled, which also shouldn't have happened. I think
the engine parked, but since we are after the set-wedged (and after the
GEM_TRACE), it is retired immediately. Hmm. Seems possible.