[Bug 104545] New: kernel: [drm] GPU HANG: ecode 9:0:0xfffffffe, reason: Hang on rcs0, action: reset

bugzilla-daemon at freedesktop.org
Tue Jan 9 03:54:31 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=104545

            Bug ID: 104545
           Summary: kernel: [drm] GPU HANG: ecode 9:0:0xfffffffe, reason:
                    Hang on rcs0, action: reset
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: critical
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: reescf at gmail.com
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Created attachment 136624
  --> https://bugs.freedesktop.org/attachment.cgi?id=136624&action=edit
GPU crash dump from /sys/class/drm/card0/error as requested

I'm filing this as a new bug, following the instructions the kernel writes to
the journal after a GPU hang.

Symptoms: The GPU sometimes hangs when the laptop is running on battery. Hangs
occur only when the laptop is left awake without interaction for a short while
(e.g. while making a cup of tea or popping to the loo). They are not
predictable, however: most of the time these conditions produce no hang. When a
hang does occur, the machine is unresponsive on return, the screen is
blank/black, and it cannot be put to sleep or woken. Following a hard reset,
messages such as the following can be found in the journal:

Ion 09 03:17:16 MyComputer kernel: [drm] GPU HANG: ecode 9:0:0xfffffffe, reason: Hang on rcs0, action: reset
Ion 09 03:17:16 MyComputer kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Ion 09 03:17:16 MyComputer kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Ion 09 03:17:16 MyComputer kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Ion 09 03:17:16 MyComputer kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Ion 09 03:17:16 MyComputer kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Ion 09 03:17:16 MyComputer kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang

The last message ("Resetting rcs0 after gpu hang") is repeated many times. As
the crash dump does not typically survive a reboot, I collected one using

# Follow the kernel log and save a compressed copy of the dump on each hang notice.
dmesg -w | awk '/GPU crash dump saved to \/sys\/class\/drm\/card0\/error/ {
    system("cat /sys/class/drm/card0/error | bzip2 > error.bz2")
}'

as suggested at https://bbs.archlinux.org/viewtopic.php?pid=1753566#p1753566.
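
The command above overwrites error.bz2 every time the notice appears. A minimal
sketch of a variant that keeps one file per notice follows. It is an
illustration only, not something taken from the forum post: the function name
save_dumps, its two arguments, and the output file naming are all hypothetical,
and it assumes bzip2 is installed and that the error file is readable (which
may require root).

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and, for every
# "GPU crash dump saved to" notice, write a separately named compressed copy
# of the crash dump.
#   $1 = path to the dump (assumed default: /sys/class/drm/card0/error)
#   $2 = output directory (assumed default: current directory)
save_dumps() {
    error_file="${1:-/sys/class/drm/card0/error}"
    out_dir="${2:-.}"
    count=0
    while IFS= read -r line; do
        case "$line" in
            *"GPU crash dump saved to"*)
                count=$((count + 1))
                # Timestamp plus counter keeps names unique within one run.
                out="$out_dir/error-$(date +%Y%m%dT%H%M%S)-$count.bz2"
                bzip2 -c < "$error_file" > "$out"
                ;;
        esac
    done
}
```

It would be used in place of the awk filter, for example
`dmesg -w | save_dumps`, and left running until a hang occurs.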

Since the dump is not especially large, I'm attaching the decompressed version. 

I would be happy to provide further information on request, provided I can
figure out how to do whatever would be helpful.
