[Bug 105733] Amdgpu randomly hangs and only ssh works. Mouse cursor moves sometimes but does nothing. Keyboard stops working.

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sun Mar 25 04:47:54 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=105733

            Bug ID: 105733
           Summary: Amdgpu randomly hangs and only ssh works. Mouse cursor
                    moves sometimes but does nothing. Keyboard stops
                    working.
           Product: DRI
           Version: XOrg git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: critical
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: allan4229 at gmail.com

Created attachment 138344
  --> https://bugs.freedesktop.org/attachment.cgi?id=138344&action=edit
dmesg, killing pids, shutting down, unloading amdgpu, xorg log

WHAT HAPPENS
- Amdgpu hangs with no clear indication of the cause.
- The mouse cursor still follows movement while the system is not completely
frozen, but clicks do nothing.
- The keyboard stops responding and the num lock state gets stuck; plugging in
a PS/2 keyboard does not help.
- The video output freezes.
- Only ssh works, and only in the cases where the system is not completely
frozen.
- The most irritating part: the system cannot be shut down, no matter what you
do:
-- Pressing the power button on the case is the only thing that produces any
response on the display: it shows a console indicating that the X server is
being stopped. Nothing else happens and the system does not power off.
-- From ssh, none of "init 0", "poweroff", "shutdown -P 0 -h" or "reboot"
works. Each command keeps waiting for something that never happens, and you
have to press Ctrl+C to get back to the ssh session. In one attempt the ssh
daemon was stopped but the shutdown itself never happened, even after 30
minutes.
-- It is IMPOSSIBLE to force-unload amdgpu using "rmmod -f amdgpu". The command
never returns; it only hangs the ssh session.
-- It is IMPOSSIBLE to kill some X-related pids properly. Trying to kill one
either does nothing or leaves the process in a defunct state. Not even
"su -c 'kill -9 <pid>'" works.
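
The unkillable and defunct processes described above can be inspected from the
ssh session. One possible explanation, not confirmed by the attached logs, is
that the affected tasks are blocked inside the kernel in uninterruptible sleep
(state D), where SIGKILL is not acted on until the task returns, or are already
zombies (state Z) waiting to be reaped by their parent. The sketch below lists
such tasks by reading /proc/<pid>/stat. The function names parse_stat and
stuck_processes are made up for this illustration.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen])
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state


def stuck_processes(proc_root="/proc"):
    """List (pid, comm, state) for tasks in state D or Z."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while scanning, or unreadable entry
        if state in ("D", "Z"):
            found.append((pid, comm, state))
    return found


if __name__ == "__main__":
    for pid, comm, state in stuck_processes():
        print(pid, state, comm)
```

If Xorg or the rmmod process shows up in state D here, that would be consistent
with the task waiting inside the driver, which is useful to mention alongside
the dmesg output.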

TIPS
- The crashes that still allow an ssh connection almost always happen when
firefox is open and playing a video (netflix, youtube) or running whatsapp web.
- The crashes that hang the entire computer may occur at any time.

OBSERVATIONS
- I use a custom kernel (from 4.15). I tried including the polaris binaries
for my card, which showed an improvement (fewer freezes) for a while. But
now it is the same again.
- I use an nvidia card in the second pci-e slot for vfio. It is a must, and I
disable nouveau as well... It should not be a reason for failing. I also tried
with another amd card and with no card in the second slot. The results were
the same, as far as I remember.

SYSTEM SPECS
- Custom kernel compilation optimized for ryzen
(https://wiki.gentoo.org/wiki/Ryzen) and using polaris binaries
(https://wiki.gentoo.org/wiki/AMDGPU)
- Chipset X370 (mobo)
- RX 480 in first slot
- GTX 1070 in second slot.
- Tried also with a RX 580 on second slot.
- Tried also with nothing on second slot.
- i3wm loading from startx command
