[Bug 150731] New: amdgpu: segfault on unbind in sysfs; card becomes nonresponsive

bugzilla-daemon at bugzilla.kernel.org
Fri Jul 29 22:41:06 UTC 2016


https://bugzilla.kernel.org/show_bug.cgi?id=150731

            Bug ID: 150731
           Summary: amdgpu: segfault on unbind in sysfs; card becomes
                    nonresponsive
           Product: Drivers
           Version: 2.5
    Kernel Version: 4.6.4
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri at kernel-bugs.osdl.org
          Reporter: JimiJames.Bove at gmail.com
        Regression: No

Full details here:
https://www.reddit.com/r/linux_gaming/comments/4udupx/nvidiaamd_support_questions/d5ovipc

Summary:
I'm using an R9 380. Others confirmed having this issue on the R9 285 and RX
480 (so, Tonga & Polaris 10 at least).

I can bind my video card to amdgpu, and that works. Binding crashes X, but when
I log back in, the card is properly connected and working.

However, if I try to unbind it, I get a segfault after waiting a few seconds.
Any subsequent attempt to do anything with that card in sysfs (unbinding again,
binding to something else, etc.) hangs forever without ever segfaulting,
because the card is no longer responding.
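
For reference, the unbind step described above can be sketched roughly as
follows. The report does not quote the exact command, so this assumes the
standard PCI driver interface in sysfs (/sys/bus/pci/drivers/<driver>/unbind).
The function name, the SYSFS_ROOT override, and the address 0000:01:00.0 are
placeholders for illustration, not taken from the report.

```shell
#!/bin/sh
# Sketch only: unbind a PCI device from its driver through sysfs.
# SYSFS_ROOT is a hypothetical override for dry runs; on a real system it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# unbind_pci_device DRIVER ADDRESS
# Example (placeholder address): unbind_pci_device amdgpu 0000:01:00.0
unbind_pci_device() {
    driver="$1"
    addr="$2"
    unbind_file="$SYSFS_ROOT/bus/pci/drivers/$driver/unbind"

    if [ ! -e "$unbind_file" ]; then
        echo "no such driver interface: $unbind_file" >&2
        return 1
    fi

    # Writing the PCI address asks the driver to release the device.
    printf '%s\n' "$addr" > "$unbind_file"
}
```

On a real system this needs root, and on the affected cards this write is
presumably where the segfault shows up.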

Removing the card (echo 1 > /sys/bus/pci/devices/0000:0X:00.0/remove) works,
but after a rescan (echo 1 > /sys/bus/pci/rescan), the card is no longer in
sysfs at all, as if it's been powered down. It can't be accessed by the system
in any way after that, until the computer reboots.
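
The remove/rescan sequence above can be wrapped in a small check that reports
whether the device comes back. This is only a sketch built from the two
commands quoted above. The function name, the SYSFS_ROOT override, and the
return codes are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: remove a PCI device, rescan the bus, and report whether the
# device reappears in sysfs.
# SYSFS_ROOT is a hypothetical override for dry runs; on a real system it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# remove_and_rescan ADDRESS
# Returns 0 if the device is present after the rescan, 1 if it was not there
# to begin with, and 2 if it did not come back.
remove_and_rescan() {
    addr="$1"
    dev_dir="$SYSFS_ROOT/bus/pci/devices/$addr"

    if [ ! -d "$dev_dir" ]; then
        echo "device $addr not present" >&2
        return 1
    fi

    # Same two writes as in the report.
    echo 1 > "$dev_dir/remove"
    echo 1 > "$SYSFS_ROOT/bus/pci/rescan"

    if [ -d "$dev_dir" ]; then
        echo "device $addr present after rescan"
    else
        echo "device $addr missing after rescan"
        return 2
    fi
}
```

On the affected cards, a call such as remove_and_rescan 0000:01:00.0 (with the
real address, as root) would be expected to take the "missing after rescan"
path.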

It may or may not be related to the "reset issues" bug:
http://vfio.blogspot.de/2015/04/progress-on-amd-front.html
https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg03128.html
That bug officially affects only Hawaii and Bonaire, but Tonga cards (380, 285)
exhibit the same behavior, though possibly not for the same reason. Whether the
reset bug affects Polaris 10 (RX 480) is unknown; the RX 480 tester is
currently finding that out.

I also had this issue on 4.6.1, so it probably affects at least the 4.6 series
in general. Maybe all kernel versions that have amdgpu?
