Hard lockups with ROCm
d.j.kasak.dk at gmail.com
Mon May 13 01:44:11 UTC 2019
Hi all. I had version 2.2.0 of the ROCm stack running on 5.0.x and 5.1.0
kernels. Things were going great with various BOINC GPU tasks, but there is
a setiathome GPU task which reliably gives me a hard lockup within about 30
minutes of running. I actually had to do *two* emergency re-installs over
the past week. Perhaps part of this was my fault (running btrfs with lzo
compression on my root partition ...), but part of it was absolutely the
hard lockups. I've tested all kinds of other things (e.g. rebuilding lots of
stuff under Gentoo), and I don't have a general stability issue, even under
hours of high load. But after restarting BOINC with that same setiathome
task ... <bang>!
If someone wants me to sacrifice another installation, they can point me to
instructions for trying to gather more information.
Anyway ... perhaps more work around detecting and recovering from GPU
lockups is in order?