drm/i915 GPU hang

Chris Wilson chris at chris-wilson.co.uk
Mon Jan 20 13:27:17 UTC 2020


Quoting Piper Fowler-Wright (2020-01-18 20:28:42)
> I have recently (since the New Year) been experiencing regular GPU hangs
> which typically render the system unusable. 
> 
> During the hangs the kernel buffer is filled with messages of the form
> 
> [ 8269.599926] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
> [ 8269.600022] i915 0000:00:02.0: Resetting chip for hang on rcs0
> [ 8269.601827] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
> [ 8269.602595] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
> [ 8277.705805] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
> 

Sadly it is known and the backport of the fix seems to have slipped
through the stable@ cracks.

It should be fixed in 5.5, which is already at -rc7 and so should be usable.
On the other hand, if the problem recurs there, we would need to use drm-tip
as a known baseline for patching anyway.
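
For anyone wanting to check which side of that line a machine is on, here
is a minimal sketch, assuming Python 3 is available. The helper name
kernel_at_least is made up for illustration, and the check only compares
the leading major.minor of the release string, so an -rc build of 5.5
counts as 5.5. It says nothing about whether a distribution kernel
carries a backport.

```python
import platform
import re


def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Return True if a release string such as '5.5.0-rc7' is >= major.minor."""
    # Only the leading "X.Y" is parsed; suffixes like "-rc7" are ignored.
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)
```

Calling kernel_at_least(platform.release(), 5, 5) applies the check to
the running kernel.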

> etc.
> 
> Most recently the following message was displayed
> 
> [12796.753277] i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
> [12796.753281] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> [12796.753282] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> [12796.753283] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> [12796.753285] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
> [12796.753286] GPU crash dump saved to /sys/class/drm/card0/error
> [12796.753304] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
> 
> Unfortunately, the /sys/class/drm/card0/error file contained only "No error
> state collected". 

The error state is only valid until the next reboot, since it is kept only
in memory.
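
Since the dump does not survive a reboot, it has to be copied somewhere
persistent while the machine is still up. A minimal sketch, assuming
Python 3 and enough privileges to read the sysfs file (usually root):
the path and the "No error state collected" text are taken from the
report above, while the function name and the output file name are
made up for illustration.

```python
import time
from pathlib import Path
from typing import Optional

# Path taken from the kernel message quoted in this thread.
ERROR_STATE = Path("/sys/class/drm/card0/error")
# Placeholder text the file contains when nothing has been captured.
EMPTY_MARKER = "No error state collected"


def save_error_state(src: Path = ERROR_STATE,
                     dest_dir: Path = Path.home()) -> Optional[Path]:
    """Copy the error state into dest_dir; return the new path, or None if empty."""
    data = src.read_bytes()
    text = data.decode("utf-8", errors="replace").strip()
    if text.startswith(EMPTY_MARKER):
        return None  # nothing collected, so nothing worth saving
    # Timestamped name so that successive dumps do not overwrite each other.
    dest = dest_dir / f"i915-error-{time.strftime('%Y%m%d-%H%M%S')}.txt"
    dest.write_bytes(data)
    return dest
```

Running save_error_state() right after a "GPU crash dump saved" message,
and before rebooting, leaves a copy in the home directory that can be
attached to a bug report.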
 
> bugs.freedesktop.org is no longer in operation so I decided to post here. Please
> redirect me to the correct list if this one is not appropriate.

FYI, the bug list is now at gitlab.freedesktop.org/drm/intel.
-Chris

