Hard lockups with ROCm

Daniel Kasak d.j.kasak.dk at gmail.com
Thu May 16 00:33:03 UTC 2019


On Mon, May 13, 2019 at 11:44 AM Daniel Kasak <d.j.kasak.dk at gmail.com>
wrote:

> Hi all. I had version 2.2.0 of the ROCm stack running on 5.0.x and 5.1.0
> kernels. Things were going great with various boinc GPU tasks. But there is
> a setiathome GPU task which reliably gives me a hard lockup within about 30
> minutes of running. I actually had to do *two* emergency re-installs over
> the past week. Perhaps part of this was my fault (running btrfs with lzo
> compression on my root partition ...). But absolutely part of this was the
> hard lockups. I've tested all kinds of other things (e.g. rebuilding lots of
> stuff under Gentoo) ... I don't have a general stability issue even under
> hours of high load. But after restarting boinc with that same setiathome
> task ... <bang>!
>
> If someone wants me to sacrifice another installation, they can point me
> to instructions for trying to gather more information.
>
> Anyway ... perhaps more work around detecting and recovering from GPU
> lockups is in order?
>
> Dan
>

<sigh>

That's what I was afraid of :(