<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [drm] GPU recovery disabled."
   href="https://bugs.freedesktop.org/show_bug.cgi?id=107154">107154</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[drm] GPU recovery disabled.
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DRI
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>DRM/AMDgpu
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dri-devel@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>freedesktop.org@nentwig.biz
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Hi!

This is a surprisingly long-standing problem with an RX 460: it has been present
since 4.15, all the way up to the 4.18 AMD staging DRM next branch [1].
After resuming from sleep (echo -n mem > /sys/power/state), amdgpu is dead
(always, reliably).
Here's what dmesg has to say about it:

[Sun Jul  8 11:01:17 2018] PM: suspend exit
[Sun Jul  8 11:01:19 2018] [drm:gfx_v8_0_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
[Sun Jul  8 11:01:19 2018] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on GFX ring (-110).
[Sun Jul  8 11:01:19 2018] [drm:process_one_work] *ERROR* ib ring test failed (-110).
[Sun Jul  8 11:01:28 2018] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=864, last emitted seq=868
[Sun Jul  8 11:01:28 2018] [drm] GPU recovery disabled.

From earlier versions:

[   42.802559] PM: suspend exit
[   42.824332] amdgpu 0000:41:00.0: GPU fault detected: 147 0x0bd84802
[   42.824338] amdgpu 0000:41:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0034F97B
[   42.824341] amdgpu 0000:41:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C048002
[   42.824345] amdgpu 0000:41:00.0: VM fault (0x02, vmid 6) at page 3471739, read from 'TC0' (0x54433000) (72)
[   52.956306] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=1287, last emitted seq=1289
[   52.956316] [drm] IP block:gfx_v8_0 is hung!
[   52.956362] [drm] GPU recovery disabled.
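
For comparing logs like the two quoted here, a minimal sketch that pulls the
sequence numbers out of the "ring ... timeout" lines and reports the gap between
the last emitted and last signaled seq. This is not part of amdgpu or any
existing tool; pending_fences is a hypothetical helper name, and the message
format is assumed to match the lines quoted in this report.

```python
import re

# Matches the amdgpu_job_timedout message as it appears in the logs in this report.
TIMEOUT_RE = re.compile(
    r"ring (\w+) timeout, last signaled seq=(\d+), last emitted seq=(\d+)"
)


def pending_fences(log_text):
    """Return a (ring, signaled, emitted, gap) tuple for each ring timeout found.

    Hypothetical helper for illustration only.
    """
    # Logs pasted into mail are often hard-wrapped, so collapse all
    # whitespace (including newlines) to single spaces before matching.
    flat = " ".join(log_text.split())
    results = []
    for ring, signaled, emitted in TIMEOUT_RE.findall(flat):
        s, e = int(signaled), int(emitted)
        results.append((ring, s, e, e - s))
    return results
```

Applied to the logs in this report, the gap is 4 for the first one (868 - 864)
and 2 for the second (1289 - 1287).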

I've also seen fault 146, but otherwise the output looks mostly the same.
4.14-lts (with dc=0) works fine.

Hardware: RX 460, Zenith Extreme, 1950X.

[1] From the Arch Linux AUR; the versioning is a bit confusing, so it may
actually already be the 4.19 branch. The latest commit is
3838e387fd1eb17bfcf6ff7d443d931adb5cb41b</pre>
        </div>
      </p>


    </body>
</html>