[Bug 103479] [BAT] igt@* - dmesg-warn - *ERROR* Timeout waiting for engines to idle | *ERROR* vecs0 is not idle before parking

bugzilla-daemon at freedesktop.org
Fri Nov 10 13:00:07 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=103479

--- Comment #14 from Chris Wilson <chris at chris-wilson.co.uk> ---
This should suppress (not fix) the spurious !idle errors.

commit 30b29406d9374989f34bce0eadaa630813049808
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Fri Nov 10 11:25:50 2017 +0000

    drm/i915: Restore the wait for idle engine after flushing interrupts

    So it appears that commit 5427f207852d ("drm/i915: Bump wait-times for
    the final CS interrupt before parking") was a little over optimistic in
    its belief that it had successfully waited for all residual activity on
    the engines before parking. Numerous sightings in CI since then of

    <7>[   52.542886] [IGT] core_auth: executing
    <3>[   52.561013] [drm:intel_engines_park [i915]] *ERROR* vcs0 is not idle before parking
    <7>[   52.561215] intel_engines_park vcs0
    <7>[   52.561229] intel_engines_park    current seqno 98, last 98, hangcheck 0 [-247449 ms], inflight 0
    <7>[   52.561238] intel_engines_park    Reset count: 0
    <7>[   52.561266] intel_engines_park    Requests:
    <7>[   52.561363] intel_engines_park    RING_START: 0x00000000 [0x00000000]
    <7>[   52.561377] intel_engines_park    RING_HEAD:  0x00000000 [0x00000000]
    <7>[   52.561390] intel_engines_park    RING_TAIL:  0x00000000 [0x00000000]
    <7>[   52.561406] intel_engines_park    RING_CTL:   0x00000000
    <7>[   52.561422] intel_engines_park    RING_MODE:  0x00000200 [idle]
    <7>[   52.561442] intel_engines_park    ACTHD:  0x00000000_00000000
    <7>[   52.561459] intel_engines_park    BBADDR: 0x00000000_00000000
    <7>[   52.561474] intel_engines_park    Execlist status: 0x00000301 00000000
    <7>[   52.561489] intel_engines_park    Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no
    <7>[   52.561500] intel_engines_park            ELSP[0] idle
    <7>[   52.561510] intel_engines_park            ELSP[1] idle
    <7>[   52.561519] intel_engines_park            HW active? 0x0
    <7>[   52.561608] intel_engines_park Idle? yes
    <7>[   52.561617] intel_engines_park

    on Braswell, which indicates that the engine just needs that little bit
    longer after flushing the tasklet to settle. So give it a few more
    milliseconds before declaring an err and applying the emergency brake.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103479
    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
    Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
    Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20171110112550.28909-1-chris@chris-wilson.co.uk
    Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

I still don't know why the engine needs that little bit of extra time to settle
(after already being given 200ms), but this should kill the false positives. If
we still get a hit after waiting another 10ms, that might show something
interesting.
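
For anyone following along, here is a rough userspace sketch of the idea
(a main wait, then a short grace wait after the flush before complaining).
It is not the i915 code: fake_engine, sim_clock, wait_for_idle and
park_with_grace are hypothetical names, the clock is simulated, and the
205ms settle time in the usage note is an invented figure. Only the 200ms
and 10ms values come from the comment above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an engine; NOT the i915 structure. */
struct fake_engine {
    const char *name;
    int settle_ms;      /* simulated time at which the engine reports idle */
};

/* Simulated monotonic clock, in milliseconds. */
struct sim_clock {
    int now_ms;
};

/* Simulated idle check: the engine is idle once the clock reaches settle_ms. */
static bool fake_engine_is_idle(const struct fake_engine *e, int now_ms)
{
    return now_ms >= e->settle_ms;
}

/*
 * Poll once per simulated millisecond until the engine is idle or
 * timeout_ms has elapsed. Returns true if idle was observed in time.
 */
static bool wait_for_idle(const struct fake_engine *e, struct sim_clock *clk,
                          int timeout_ms)
{
    assert(timeout_ms >= 0);
    int deadline = clk->now_ms + timeout_ms;

    for (;;) {
        if (fake_engine_is_idle(e, clk->now_ms))
            return true;
        if (clk->now_ms >= deadline)
            return false;
        clk->now_ms += 1;   /* stand-in for sleeping 1ms */
    }
}

/*
 * Two-stage wait: the main wait first, then (where a real driver would
 * flush interrupts and tasklets) a short grace wait. Only if both fail
 * would the caller report "not idle before parking".
 */
static bool park_with_grace(const struct fake_engine *e, struct sim_clock *clk,
                            int main_ms, int grace_ms)
{
    if (wait_for_idle(e, clk, main_ms))
        return true;
    /* assumption: the flush would happen here in a real driver */
    return wait_for_idle(e, clk, grace_ms);
}
```

With a simulated engine that settles at 205ms, park_with_grace(&e, &clk,
200, 10) succeeds, whereas a grace of 0 reports not idle. An engine that
still fails after the extra 10ms would be the "interesting" case.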
