[Bug 102393] [APL] Random GPU Random Hang Apollo Lake (ecode 9:1:0xeeffefa1) reason: Hang on bcs

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Wed Feb 7 12:31:33 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=102393

Chris Wilson <chris at chris-wilson.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #13 from Chris Wilson <chris at chris-wilson.co.uk> ---
commit ba74cb10c775c839f6e1d0fabd1e772eabd9c43f
Author: Michel Thierry <michel.thierry at intel.com>
Date:   Mon Nov 20 12:34:58 2017 +0000

    drm/i915/execlists: Delay writing to ELSP until HW has processed the
previous write

    The hardware needs some time to process the information received in the
    ExecList Submission Port, and expects us to not write anything more until
    it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or
    PREEMPTED CSB event.

    If we do not follow this, the driver could write new data into the ELSP
    before HW had finishing fetching the previous one, putting us in
    'undefined behaviour' space.

    This seems to be the problem causing the spurious PREEMPTED & COMPLETE
    events after a COMPLETE like the one below:

    [] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3.
    [] vcs0:  Execlist CSB[0]: 0x00000018 _ 0x00000007
    [] vcs0:  Execlist CSB[1]: 0x00000001 _ 0x00000000
    [] vcs0:  Execlist CSB[2]: 0x00000018 _ 0x00000007  <<< COMPLETE
    [] vcs0:  Execlist CSB[3]: 0x00000012 _ 0x00000007  <<< PREEMPTED &
COMPLETE
    [] vcs0:  Execlist CSB[4]: 0x00008002 _ 0x00000006
    [] vcs0:  Execlist CSB[5]: 0x00000014 _ 0x00000006

    The ELSP writes that lead to this CSB sequence show that the HW hadn't
    started executing the previous execlist (the one with only ctx 0x6) by the
    time the new one was submitted; this is a bit more clear in the data
    show in the EXECLIST_STATUS register at the time of the ELSP write.

    [] vcs0: ELSP[0] = 0x0_0        [execlist1] - status_reg = 0x0_302
    [] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302

    [] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308
    [] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308

    Note that having to wait for this ack does not disable lite-restores,
    although it may reduce their numbers.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035
    Signed-off-by: Michel Thierry <michel.thierry at intel.com>
    Link:
https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com>
    Link:
https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.uk
    Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
    Tested-by: Chris Wilson <chris at chris-wilson.co.uk>
    Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>

*** This bug has been marked as a duplicate of bug 102035 ***

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the QA Contact for the bug.
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx-bugs/attachments/20180207/c1a86249/attachment.html>


More information about the intel-gfx-bugs mailing list