[Bug 112157] [CI][SHARDS]igt at kms_frontbuffer_tracking@psr-modesetfrombusy - dmesg-warn - GEM_BUG_ON(((tail) & ~((__typeof__(tail))((64)-1))) == ((ring->head) & ~((__typeof__(ring->head))((64)-1))) && tail < ring->head)

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Oct 29 11:54:23 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=112157

Chris Wilson <chris at chris-wilson.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Chris Wilson <chris at chris-wilson.co.uk> ---
<0>[ 2016.271448]   <idle>-0       0..s1 2012175800us : process_csb: rcs0 cs-irq head=0, tail=1
<0>[ 2016.271485]   <idle>-0       0..s1 2012175800us : process_csb: rcs0 csb[1]: status=0x00000882:0x00000060
<0>[ 2016.271524]   <idle>-0       0..s1 2012175801us : trace_ports: rcs0: preempted { 3d30:32, 0:0 }
<0>[ 2016.271561] gem_exec-8517    1d..1 2012175801us : __execlists_submission_tasklet: vcs1: queue_priority_hint:-2147483648, submit:yes
<0>[ 2016.271601]   <idle>-0       0..s1 2012175801us : process_csb: reset_active(rcs0): { rq=3d30:32 }
<0>[ 2016.271639] gem_exec-8517    1d..1 2012175802us : trace_ports: vcs1: submit { a:428, 0:0 }
<0>[ 2016.271683] gem_exec-8517    1.... 2012175811us : __i915_request_commit: vecs0 fence c:608
<0>[ 2016.271721]   <idle>-0       0..s1 2012175811us : trace_ports: rcs0: promote { 4:9442!, 0:0 }
<0>[ 2016.271765]   <idle>-0       0d.s2 2012175813us : __i915_request_submit: rcs0 fence 3d30:12, current 10
<0>[ 2016.271810]   <idle>-0       0d.s2 2012175820us : __i915_request_submit: rcs0 fence 3d30:14, current 10
<0>[ 2016.271855]   <idle>-0       0d.s2 2012175825us : __i915_request_submit: rcs0 fence 3d30:16, current 10
<0>[ 2016.271900] gem_exec-8517    1d..1 2012175825us : __i915_request_submit: vecs0 fence c:608, current 606
<0>[ 2016.271944]   <idle>-0       0d.s2 2012175829us : __i915_request_submit: rcs0 fence 3d30:18, current 10
<0>[ 2016.271985] gem_exec-8517    1d..1 2012175831us : __execlists_submission_tasklet: vecs0: queue_priority_hint:-2147483648, submit:yes
<0>[ 2016.272027] gem_exec-8517    1d..1 2012175832us : trace_ports: vecs0: submit { c:608, 0:0 }
<0>[ 2016.272070]   <idle>-0       0d.s2 2012175833us : __i915_request_submit: rcs0 fence 3d30:20, current 10
<0>[ 2016.272115]   <idle>-0       0d.s2 2012175837us : __i915_request_submit: rcs0 fence 3d30:22, current 10
<0>[ 2016.272159]   <idle>-0       0d.s2 2012175840us : __i915_request_submit: rcs0 fence 3d30:24, current 10
<0>[ 2016.272203]   <idle>-0       0d.s2 2012175843us : __i915_request_submit: rcs0 fence 3d30:26, current 10
<0>[ 2016.272247]   <idle>-0       0d.s2 2012175846us : __i915_request_submit: rcs0 fence 3d30:28, current 10
<0>[ 2016.272292]   <idle>-0       0d.s2 2012175850us : __i915_request_submit: rcs0 fence 3d30:30, current 10
<0>[ 2016.272331] gem_exec-8517    1.... 2012176048us : __intel_context_do_pin: rcs0 context:3d2e pin ring:{head:0000, tail:0000}
<0>[ 2016.272370] gem_exec-8517    1.... 2012176053us : intel_context_unpin: rcs0 context:3d2e retire
<0>[ 2016.272409]   <idle>-0       0d.s2 2012176132us : assert_ring_tail_valid.part.38: assert_ring_tail_valid:101 GEM_BUG_ON(((tail) & ~((__typeof__(tail))((64)-1))) == ((ring->head) & ~((__typeof__(ring->head))((64)-1))) && tail < ring->head)
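
For readers decoding the expanded macro in that last trace line: the condition rounds both tail and ring->head down to a 64-byte boundary and fires when the two land in the same cacheline while tail is still behind head. A minimal standalone sketch of that predicate follows. The names CACHELINE_BYTES, cacheline_base and tail_trips_check are made up for illustration and are not claimed to be the driver's own helpers; only the arithmetic is taken from the expanded GEM_BUG_ON text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64 mirrors the literal (64) in the expanded macro. */
#define CACHELINE_BYTES 64u

/* Round a ring offset down to the start of its 64-byte cacheline. */
static uint32_t cacheline_base(uint32_t offset)
{
	/* The mask trick below only works for power-of-two sizes. */
	assert((CACHELINE_BYTES & (CACHELINE_BYTES - 1u)) == 0);
	return offset & ~(CACHELINE_BYTES - 1u);
}

/*
 * Hypothetical helper: true when the condition inside the GEM_BUG_ON
 * holds, i.e. tail shares a cacheline with head but sits before it.
 */
static bool tail_trips_check(uint32_t tail, uint32_t head)
{
	return cacheline_base(tail) == cacheline_base(head) && tail < head;
}
```

With these definitions, tail_trips_check(0x08, 0x10) is true (same cacheline, tail behind head), while tail_trips_check(0x00, 0x40) is false because the two offsets fall in different cachelines.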

So it's the rogue i915.hangcheck=0 leading to context cancellation.

I believe this is fixed by

commit a7f328fc789817a6a0e5c46411956810d5ee00ca
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Mon Oct 28 12:41:25 2019 +0000

    drm/i915/execlists: Simply walk back along request timeline on reset

    The request's timeline will only contain requests from this context, in
    order of execution. Therefore, we can simply look back along this
    timeline to find the currently executing request.

    If we do find that the current context has completed its last request,
    that does not imply that all requests are completed in the context, so
    only advance the ring->head up to the end of the known completions!

    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20191028124125.25176-1-chris@chris-wilson.co.uk
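
The idea in that commit message can be sketched outside the driver as follows. This is only an illustration under stated assumptions, not the patch itself: struct sketch_request, find_active and reset_head are hypothetical names, the timeline is modelled as a plain array ordered oldest-first, and requests are assumed to complete in order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a request; not the i915 structure. */
struct sketch_request {
	uint32_t head;  /* ring offset where this request's commands start */
	uint32_t tail;  /* ring offset just past this request's commands */
	bool completed;
};

/*
 * Walk back from index `last` along a timeline stored oldest-first and
 * return the index of the oldest request that has not yet completed.
 * If nothing before `last` is incomplete, `last` itself is returned.
 */
static size_t find_active(const struct sketch_request *rq, size_t last)
{
	size_t active = last;

	assert(rq != NULL);
	while (active > 0 && !rq[active - 1].completed)
		active--;
	return active;
}

/*
 * Pick the ring head to resume from after a reset: if even the request
 * we stopped at has completed, advance only to the end of that known
 * completion; otherwise rewind to the start of the active request.
 */
static uint32_t reset_head(const struct sketch_request *rq, size_t last)
{
	size_t active = find_active(rq, last);

	if (rq[active].completed)
		return rq[active].tail;
	return rq[active].head;
}
```

The point of the second branch in reset_head is the one the commit message stresses: a completed last-seen request lets the head move to that request's tail, and no further.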

The dangerous part is that this is precipitated by hangcheck=0 and so will not
appear again in CI... That suggests I need a better smoketest for persistence
opt-out.
