[Bug 112428] New: Unrecoverable GPU hang with 5.4.0 kernel

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Fri Nov 29 13:11:21 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=112428

            Bug ID: 112428
           Summary: Unrecoverable GPU hang with 5.4.0 kernel
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: not set
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: L.Bonnaud at laposte.net
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Hi,

I was using my system, doing nothing special, and the GPU hung.

There are many reports about GPU hangs, but this one seems different:
 - it occurred with kernel 5.4.0 rather than a 5.3.x kernel (my Intel GPU also
had many problems with 5.3.x kernels)
 - the GPU never recovered (which, BTW, caused some data loss).  I had to ssh
into the system to collect debug info.

Here is some system info (full details below):

Kernel: Linux xeelee 5.4.0-050400-generic #201911242031 SMP Mon Nov 25 01:35:10 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Distribution: Ubuntu 19.10

Machine: Intel NUC7i5BNB

Display connector: HDMI 2.0

[233850.738984] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[233850.739750] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[233850.739824] i915 0000:00:02.0: Resetting chip for hang on rcs0
[233850.741595] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[233850.742349] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[234291.141681] INFO: task kworker/0:0:5853 blocked for more than 120 seconds.
[234291.141690]       Not tainted 5.4.0-050400-generic #201911242031
[234291.141693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[234291.141697] kworker/0:0     D    0  5853      2 0x80004000
[234291.141823] Workqueue: events i915_hotplug_work_func [i915]
[234291.141826] Call Trace:
[234291.141839]  __schedule+0x2e3/0x740
[234291.141846]  schedule+0x42/0xb0
[234291.141852]  schedule_preempt_disabled+0xe/0x10
[234291.141857]  __ww_mutex_lock.isra.0+0x261/0x7f0
[234291.141864]  __ww_mutex_lock_slowpath+0x16/0x20
[234291.141869]  ww_mutex_lock+0x38/0x90
[234291.141916]  drm_modeset_lock+0x35/0xb0 [drm]
[234291.142025]  intel_dp_retrain_link+0x94/0x1c0 [i915]
[234291.142122]  intel_ddi_hotplug+0x7a/0x350 [i915]
[234291.142130]  ? __switch_to_asm+0x40/0x70
[234291.142135]  ? __switch_to_asm+0x34/0x70
[234291.142140]  ? __switch_to_asm+0x40/0x70
[234291.142146]  ? __switch_to_asm+0x40/0x70
[234291.142238]  i915_hotplug_work_func+0x18b/0x280 [i915]
[234291.142249]  process_one_work+0x1ec/0x3a0
[234291.142256]  worker_thread+0x4d/0x400
[234291.142262]  kthread+0x104/0x140
[234291.142268]  ? process_one_work+0x3a0/0x3a0
[234291.142274]  ? kthread_park+0x90/0x90
[234291.142281]  ret_from_fork+0x35/0x40
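
For anyone comparing this log against other hang reports, here is a minimal sketch that pulls the engine reset timeouts and hung-task notices out of a dmesg excerpt. The patterns are derived only from the message formats visible in the log above, so they are an assumption and may not match other kernel versions. The names summarize, RESET_TIMEOUT and HUNG_TASK are hypothetical and not part of any i915 tooling.

```python
import re

# Assumption: message formats are exactly those seen in the log above.
RESET_TIMEOUT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:\w+ \[i915\]\] \*ERROR\* "
    r"(?P<engine>\w+) reset request\s+timed out"
)
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<task>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
# A newline that is NOT followed by a "[seconds.micros]" prefix is treated
# as a wrap introduced by the mail client, not as a new log record.
WRAPPED_NEWLINE = re.compile(r"\n(?!\[\s*\d+\.\d+\])")


def summarize(log_text):
    """Return (reset_timeouts, hung_tasks) found in a dmesg excerpt.

    reset_timeouts: list of (timestamp, engine)
    hung_tasks:     list of (timestamp, task_name, pid, seconds_blocked)
    """
    joined = WRAPPED_NEWLINE.sub(" ", log_text)
    resets = [
        (float(m["ts"]), m["engine"]) for m in RESET_TIMEOUT.finditer(joined)
    ]
    hung = [
        (float(m["ts"]), m["task"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(joined)
    ]
    return resets, hung
```

Feeding it the excerpt above (for example, saved from `dmesg` to a text file and read with `open(...).read()`) should list the rcs0 reset timeouts and the blocked kworker task, which makes it easier to see how long after the failed reset the hung-task warning appeared.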
