[Bug 75701] New: Radeon: GPU recovery is unable to recover from GPU lockups (HD5770 - OpenCL example).
bugzilla-daemon at bugzilla.kernel.org
Wed May 7 18:46:51 PDT 2014
https://bugzilla.kernel.org/show_bug.cgi?id=75701
Bug ID: 75701
Summary: Radeon: GPU recovery is unable to recover from GPU
lockups (HD5770 - OpenCL example).
Product: Drivers
Version: 2.5
Kernel Version: 3.15-rc4
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri at kernel-bugs.osdl.org
Reporter: t3st3r at mail.ru
Regression: No
Created attachment 135351
--> https://bugzilla.kernel.org/attachment.cgi?id=135351&action=edit
Unsuccessful GPU recovery attempt - kernel log
In some cases Radeon GPUs lock up because of MESA errors and the like. While
those are MESA bugs, I believe there is a kernel-side bug as well.
The kernel-side problem is how the kernel handles the GPU recovery procedure.
Right now GPU recovery fails most of the time on virtually any MESA bug and any
GPU, and the system is left in a completely unusable state due to the lack of
graphics output.
A couple of recent examples will be filed for 2 GPU families.
*This* bug is for a GPU deadlock on HD5770 (Evergreen - JUNIPER) triggered by
buggy MESA OpenCL operations.
To reproduce:
1) Install Ubuntu 14.04.
2) Add "oibaf PPA" to get recent MESA-based drivers.
3) Update GPU drivers from Oibaf PPA.
4) Install mesa-opencl-icd library for OpenCL (icd based) support.
5) Boot with the 3.15-rc4 kernel (self-compiled or taken from the kernel PPA;
this does not affect the bug).
6) Get the "Clpeak" tool (https://github.com/krrishnarraj/clpeak.git), an
OpenCL VRAM benchmark tool, and build it.
7) Try to run it.
8) The program runs some benchmarks, then the GPU locks up.
9) The kernel then attempts recovery, which fails every time.
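Steps 2-7 above could be scripted roughly as follows. This is only a sketch:
the PPA identifier (ppa:oibaf/graphics-drivers) and the CMake build commands
are assumptions not stated in this report, and step 5 (booting 3.15-rc4) has to
be done by hand beforehand. By default it only prints the commands.

```python
import shlex
import subprocess

# Assumed values: the PPA identifier and the CMake build steps are not given
# in the report and may need adjusting for your system.
OIBAF_PPA = "ppa:oibaf/graphics-drivers"
CLPEAK_REPO = "https://github.com/krrishnarraj/clpeak.git"


def repro_commands():
    """Return steps 2-7 as (command, working directory) pairs."""
    return [
        (f"sudo add-apt-repository -y {OIBAF_PPA}", None),  # step 2
        ("sudo apt-get update", None),                      # step 3
        ("sudo apt-get -y dist-upgrade", None),             # step 3
        ("sudo apt-get -y install mesa-opencl-icd", None),  # step 4
        (f"git clone {CLPEAK_REPO}", None),                 # step 6
        ("cmake .", "clpeak"),                              # step 6 (assumed build)
        ("make", "clpeak"),                                 # step 6 (assumed build)
        ("./clpeak", "clpeak"),                             # step 7
    ]


def run(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command, cwd in repro_commands():
        print(f"[{cwd or '.'}] $ {command}")
        if not dry_run:
            subprocess.run(shlex.split(command), cwd=cwd, check=True)


if __name__ == "__main__":
    run(dry_run=True)
```

Calling run(dry_run=False) would actually execute the commands; on an affected
system the last one is expected to hang the GPU, so save any work first.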
Result:
The GPU locks up and recovery fails. The system is left in an unusable state
due to the lack of graphics output.
Expected:
More or less sane GPU recovery. Some data could be lost, the picture can be
distorted, some OpenCL/OpenGL calls can return errors, and some programs can
crash. But leaving the GPU in a faulty state and trying to restore the very
same faulty state (without success, obviously) isn't an option. What happens
now is the absolute worst form of GPU recovery, as it leaves the system in an
unusable state with a GPU which can't be brought back without a reboot (there
is no screen output at this point).
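To tell from a saved kernel log whether recovery was attempted and how it
ended, something like the sketch below may help. The substrings it matches are
assumptions about how the radeon driver words its messages; they are not quoted
from the attached log and should be adjusted to what that log actually shows.

```python
import re

# Assumed substrings: the exact wording of the radeon driver's messages is not
# quoted in this report, so adjust these patterns to match the real log.
LOCKUP = re.compile(r"GPU lockup", re.IGNORECASE)
RESET_OK = re.compile(r"GPU reset succeeded", re.IGNORECASE)
RESET_FAILED = re.compile(r"GPU reset failed", re.IGNORECASE)


def summarize_recovery(lines):
    """Count lockups and reset outcomes in an iterable of kernel log lines."""
    counts = {"lockups": 0, "resets_ok": 0, "resets_failed": 0}
    for line in lines:
        if LOCKUP.search(line):
            counts["lockups"] += 1
        elif RESET_OK.search(line):
            counts["resets_ok"] += 1
        elif RESET_FAILED.search(line):
            counts["resets_failed"] += 1
    return counts
```

For example, summarize_recovery(open("dmesg.txt")) would scan a log saved under
the hypothetical file name dmesg.txt. Note that a reset reported as succeeded
in the log does not by itself mean the display came back.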
--
You are receiving this mail because:
You are watching the assignee of the bug.