[Bug 93579] New: drm_intel_gem_bo_context_exec() failed: Input/output error

Mon Jan 4 09:41:30 PST 2016

https://bugs.freedesktop.org/show_bug.cgi?id=93579

            Bug ID: 93579
           Summary: drm_intel_gem_bo_context_exec() failed: Input/output
                    error
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: frank.dittrich at mailbox.org
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Created attachment 120800
  --> https://bugs.freedesktop.org/attachment.cgi?id=120800&action=edit
Contents of /sys/class/drm/card0/error

This is from dmesg:

[ 1791.185004] [drm] stuck on render ring
[ 1791.186261] [drm] GPU HANG: ecode 7:0:0x85ddfffc, in john [2277], reason: Ring hung, action: reset
[ 1791.186265] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1791.186268] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1791.186270] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1791.186273] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1791.186275] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1791.188446] drm/i915: Resetting chip after gpu hang
[ 1797.181911] [drm] stuck on render ring
[ 1797.183147] [drm] GPU HANG: ecode 7:0:0x85ddfffc, in john [2277], reason: Ring hung, action: reset
[ 1797.185338] drm/i915: Resetting chip after gpu hang
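
For anyone triaging similar logs, the "GPU HANG" summary line can be split into its fields mechanically. The following is only a sketch: the names `HANG_RE`, `GpuHang` and `parse_gpu_hang` are made up for illustration, and reading the three ecode fields as GPU generation, ring index and error-state hash is my assumption about the i915 driver of that time, not something stated in the log. It expects the summary on a single line, so a line wrapped by a mail client has to be rejoined first.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the i915 hang summary line (hypothetical helper, not part of any tool).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<ring>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], reason: (?P<reason>[^,]+), "
    r"action: (?P<action>\w+)"
)


class GpuHang(NamedTuple):
    gen: int      # assumed: GPU generation
    ring: int     # assumed: index of the hung ring
    ecode: int    # assumed: hash derived from the error state
    comm: str     # name of the process blamed for the hang
    pid: int
    reason: str
    action: str


def parse_gpu_hang(line: str) -> Optional[GpuHang]:
    """Return the parsed fields, or None if the line is not a hang summary."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return GpuHang(
        gen=int(m["gen"]),
        ring=int(m["ring"]),
        ecode=int(m["ecode"], 16),
        comm=m["comm"],
        pid=int(m["pid"]),
        reason=m["reason"].strip(),
        action=m["action"],
    )


line = (
    "[ 1791.186261] [drm] GPU HANG: ecode 7:0:0x85ddfffc, in john [2277], "
    "reason: Ring hung, action: reset"
)
hang = parse_gpu_hang(line)
print(hang)
```

Both hang reports in this log carry the same ecode and the same process, which is what one would expect if the same kernel submission hangs again after the reset.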

I am attaching the contents of /sys/class/drm/card0/error.
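
In case it helps others reproduce the report, this is roughly how the dump can be copied to a regular file before attaching it. It is a sketch under assumptions: `save_error_state` and the default output name `i915_error_state.txt` are hypothetical, and reading the sysfs file usually requires root. The copy streams the data instead of relying on the file size, since sysfs files do not report a meaningful size.

```python
import shutil
from pathlib import Path


def save_error_state(src: str = "/sys/class/drm/card0/error",
                     dst: str = "i915_error_state.txt") -> int:
    """Copy the GPU crash dump from src to dst; return the bytes written."""
    # Stream the contents: the sysfs file size cannot be trusted.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return Path(dst).stat().st_size
```

Calling `save_error_state()` as root right after the hang, before rebooting, should leave a file that can be attached to the bug.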

I got the error on a Fedora 22 system with kernel
4.4.0-0.rc6.git1.1.vanilla.knurd.1.fc22.x86_64.

The CPU is Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz.

I built and installed beignet from the latest master commit:
(master)beignet $ git describe --tags 
Release_v1.0.0-654-gf749808

Then I built John the Ripper from a recent bleeding-jumbo commit:
https://github.com/magnumripper/JohnTheRipper/commit/ca11872eaf094b0dbe90ba3f74fae5366d2b3125
(bleeding-jumbo)run $ git describe --tags 
1.8.0.6-jumbo-1-1814-gca11872

(bleeding-jumbo)run $ ./john --list=build-info 
Version: 1.8.0.6-jumbo-1-1814-gca11872
Build: linux-gnu 64-bit AVX2-ac OMP
SIMD: AVX2, interleaving: MD4:3 MD5:3 SHA1:1 SHA256:1 SHA512:1
$JOHN is ./
Format interface version: 13
Max. number of reported tunable costs: 3
Rec file version: REC4
Charset file version: CHR3
CHARSET_MIN: 1 (0x01)
CHARSET_MAX: 255 (0xff)
CHARSET_LENGTH: 24
SALT_HASH_SIZE: 1048576
Max. Markov mode level: 400
Max. Markov mode password length: 30
gcc version: 5.3.1
GNU libc version: 2.21 (loaded: 2.21)
OpenCL headers version: 1.2
Crypto library: OpenSSL
OpenSSL library version: 0100010bf
OpenSSL 1.0.1k-fips 8 Jan 2015
GMP library version: 6.0.0
Regex library version: 1.3    (loaded: 1.3.1)
File locking: fcntl()
fseek(): fseek
ftell(): ftell
fopen(): fopen
memmem(): System's

I got the GPU hang when running the self test for the office2013-opencl format.

(bleeding-jumbo)run $ ./john --test=0 --format=office2013-opencl
Will run 4 OpenMP threads
Device 0: Intel(R) HD Graphics Haswell GT2 Desktop
Testing: office2013-opencl, MS Office 2013 (100,000 iterations) [SHA512 OpenCL
2x AES]... (4xOMP) Options used: -I ./kernels -cl-mad-enable -D__GPU__
-DDEVICE_INFO=34 -DSIZEOF_SIZE_T=8 -DDEV_VER_MAJOR=1 -DDEV_VER_MINOR=2
-D_OPENCL_COMPILER -DHASH_LOOPS=100 -DUNICODE_LENGTH=96 -DV_WIDTH=2
$JOHN/kernels/office2013_kernel.cl
Local worksize (LWS) 7, global worksize (GWS) 49
drm_intel_gem_bo_context_exec() failed: Input/output error
OpenCL CL_OUT_OF_RESOURCES error in opencl_office2013_fmt_plug.c:323 - failed
in clEnqueueNDRangeKernel
(bleeding-jumbo)run $ echo $?
1

I got similar problems with some older Linux kernels, beignet versions and John
the Ripper versions.

I have no idea what causes the GPU hang, but I doubt John the Ripper is to
blame.

John the Ripper's OpenCL formats work on various other GPUs.

Once I even ran the same test after disabling hangcheck:

# echo -n 0 > /sys/module/i915/parameters/enable_hangcheck

After about half an hour I had to switch off power.
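
Before repeating such a test it may be worth confirming which state hangcheck is actually in. A small sketch, assuming the parameter reads back as Y/N (or 1/0); the helper name `hangcheck_enabled` is hypothetical:

```python
def hangcheck_enabled(
    path: str = "/sys/module/i915/parameters/enable_hangcheck",
) -> bool:
    """Return True if the i915 hangcheck parameter reads as enabled."""
    with open(path) as f:
        value = f.read().strip()
    # Assumption: boolean module parameters read back as Y/N, possibly 1/0.
    return value in ("Y", "y", "1")
```

With hangcheck disabled the driver no longer detects and resets a hung ring, so a hanging kernel can keep the GPU stuck indefinitely.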

Please let me know what else to test.

I would like to know what causes the GPU hang and how to fix or work around it.
