[Bug 105900] [CI] igt@gem_exec_* - fail - Failed assertion: !"GPU hung"
bugzilla-daemon at freedesktop.org
Thu May 3 14:35:50 UTC 2018
https://bugs.freedesktop.org/show_bug.cgi?id=105900
--- Comment #7 from Chris Wilson <chris at chris-wilson.co.uk> ---
(In reply to Chris Wilson from comment #6)
> (In reply to Chris Wilson from comment #5)
> > (In reply to Martin Peres from comment #4)
> > > https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_29/fi-cnl-y3/igt@gem_exec_await@wide-contexts.html
> > >
> > > (gem_exec_await:2291) igt_aux-CRITICAL: Test assertion failure function sig_abort, file ../lib/igt_aux.c:481:
> > > (gem_exec_await:2291) igt_aux-CRITICAL: Failed assertion: !"GPU hung"
> > > Subtest wide-contexts failed.
> >
> > This is a different issue. The GPU hang here is a result of hitting a
> > blocking ioctl in the test.
>
> Using Execlists submission
> Ring size: 143 batches
>
> If we can only fit 143 batches in a ring, why did we submit 144?...
Nah, the last batch has seqno 144. It's just an off-by-one (or at least
misleading) comment about the number of skipped batches. The ring that got
stuck was:
<7>[ 134.752727] hangcheck vecs0
<7>[ 134.752730] hangcheck current seqno c707, last c740, hangcheck c707 [4031 ms]
<7>[ 134.752733] hangcheck Reset count: 0 (global 0)
<7>[ 134.752736] hangcheck Requests:
<7>[ 134.752740] hangcheck first c708 [4e17:1] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752744] hangcheck last c740 [4e17:39] prio=0 @ 4033ms: gem_exec_await[2291]/4
<7>[ 134.752748] hangcheck active c708 [4e17:1] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752752] hangcheck [head 0000, postfix 0030, tail 0050, batch 0x00000000_00040000]
<7>[ 134.752755] hangcheck ring->start: 0x035ac000
<7>[ 134.752758] hangcheck ring->head: 0x00000000
<7>[ 134.752761] hangcheck ring->tail: 0x000011c8
<7>[ 134.752764] hangcheck ring->emit: 0x000011d0
<7>[ 134.752767] hangcheck ring->space: 0x00002df0
<7>[ 134.752772] hangcheck RING_START: 0x035ac000
<7>[ 134.752776] hangcheck RING_HEAD: 0x00000020
<7>[ 134.752780] hangcheck RING_TAIL: 0x000011c8
<7>[ 134.752786] hangcheck RING_CTL: 0x00003001
<7>[ 134.752791] hangcheck RING_MODE: 0x00000000
<7>[ 134.752795] hangcheck RING_IMR: fffffeff
<7>[ 134.752802] hangcheck ACTHD: 0x00000000_00040000
<7>[ 134.752809] hangcheck BBADDR: 0x00000000_00040001
<7>[ 134.752816] hangcheck DMA_FADDR: 0x00000000_00040200
<7>[ 134.752821] hangcheck IPEIR: 0x00000000
<7>[ 134.752825] hangcheck IPEHR: 0x18800101
<7>[ 134.752830] hangcheck Execlist status: 0x00044052 0000057f
<7>[ 134.752835] hangcheck Execlist CSB read 1 [1 cached], write 1 [1 from hws], interrupt posted? no, tasklet queued? no (enabled)
<7>[ 134.752840] hangcheck ELSP[0] count=1, rq: c740 [4e17:39] prio=0 @ 4033ms: gem_exec_await[2291]/4
<7>[ 134.752843] hangcheck ELSP[1] idle
<7>[ 134.752846] hangcheck HW active? 0x5
<7>[ 134.752850] hangcheck E c708 [4e17:1] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752853] hangcheck E c709 [4e17:2] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752857] hangcheck E c70a [4e17:3] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752861] hangcheck E c70b [4e17:4] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752865] hangcheck E c70c [4e17:5] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752868] hangcheck E c70d [4e17:6] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752872] hangcheck E c70e [4e17:7] prio=0 @ 4036ms: gem_exec_await[2291]/4
<7>[ 134.752880] hangcheck ...skipping 49 executing requests...
<7>[ 134.752884] hangcheck E c740 [4e17:39] prio=0 @ 4033ms: gem_exec_await[2291]/4
<7>[ 134.752887] hangcheck Queue priority: -2147483648
<7>[ 134.752890] hangcheck IRQ? 0x1 (breadcrumbs? yes) (execlists? no)
<7>[ 134.752893] hangcheck HWSP:
<7>[ 134.752897] hangcheck 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7>[ 134.752900] hangcheck *
<7>[ 134.752904] hangcheck 00000040 00008002 0000057f 00008002 0000057f 00008002 0000057f 00008002 0000057f
<7>[ 134.752909] hangcheck 00000060 00008002 0000057f 00008002 0000057f 00000000 00000000 00000000 00000000
<7>[ 134.752913] hangcheck 00000080 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7>[ 134.752917] hangcheck 000000a0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
<7>[ 134.752922] hangcheck 000000c0 0000c707 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7>[ 134.752926] hangcheck 000000e0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7>[ 134.752929] hangcheck *
<7>[ 134.752932] hangcheck Idle? no
which isn't out of ring space... Oh, maybe it is just the premature hangcheck,
but how? Hmm.
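
For reference, the "isn't out of ring space" reading can be cross-checked
against the numbers in the dump. The sketch below is only an illustration and
rests on assumptions that are not stated in this comment: that RING_CTL encodes
the ring length as (size - 4 KiB) in the 0x1ff000 field, and that the driver
computes free space as (head - emit - 64) masked to the ring size. The helper
names are hypothetical. Under those assumptions the logged ring->space value
falls out exactly, and the seqno span matches the number of "E" requests shown.

```python
# Assumed constants (not taken from the comment itself).
CACHELINE_BYTES = 64   # assumed guard gap kept between emit and head
PAGE_SIZE = 4096       # assumed page size used by the RING_CTL length field


def ring_size_from_ctl(ring_ctl):
    """Hypothetical helper: decode ring size from RING_CTL (assumed layout)."""
    return (ring_ctl & 0x1FF000) + PAGE_SIZE


def ring_space(head, emit, size):
    """Hypothetical helper: free bytes in a power-of-two ring (assumed formula)."""
    return (head - emit - CACHELINE_BYTES) & (size - 1)


# Values copied from the hangcheck dump above.
size = ring_size_from_ctl(0x00003001)        # RING_CTL
space = ring_space(0x00000000, 0x000011D0, size)  # ring->head, ring->emit

# Requests still outstanding: last submitted seqno minus current seqno.
outstanding = 0xC740 - 0xC707
# "E" lines in the dump: 7 listed, 49 skipped, 1 final.
listed = 7 + 49 + 1

print(hex(size), hex(space), outstanding, listed)
```

If the assumptions hold, roughly 0x2df0 of 0x4000 bytes are still free, which
agrees with the ring not being full when hangcheck fired.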