<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Using Fan-Control causes mmhub-pagefault and unresponsive system on Navi"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=111528">111528</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Using Fan-Control causes mmhub-pagefault and unresponsive system on Navi
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DRI
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>not set
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>DRM/AMDgpu
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dri-devel@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>saldorin@web.de
          </td>
        </tr></table>
      <p>
        <div>
        <pre>I first thought my issue was related to
<a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - AMD Navi GPU frequent freezes on both Manjaro/Ubuntu with kernel 5.3 and mesa 19.2 -git/llvm9"
   href="show_bug.cgi?id=111481">https://bugs.freedesktop.org/show_bug.cgi?id=111481</a>, but it seems to be a
different one.

When using any kind of fan-control software (I tried CoreCtrl and
radeon-profile), after a while I get a strange "stuttering", as if the whole
OS halted for a few seconds and then resumed for a few seconds. The halted
periods grow while the usable periods shrink quickly, to the point where the
system seems unresponsive.
It's not just the GUI that halts but the whole system: rsync was running one
time, and the HDD is audible enough that I could hear it was only active
during the seconds the GUI was responsive.

It doesn't happen at regular intervals (anywhere between 30 and 120 minutes),
and I haven't yet identified a direct cause, but in journalctl the same
messages appear every time it begins:

kernel: amdgpu: [powerplay] Failed to send message 0xf, response 0xfffffffb, param 0xfd6000
kernel: amdgpu: [powerplay] Failed to send message 0xf, response 0xfffffffb, param 0xfd6000
kernel: amdgpu 0000:0f:00.0: [mmhub] VMC page fault (src_id:0 ring:169 vmid:0 pasid:0)
kernel: amdgpu 0000:0f:00.0:   at page 0x0000600000fd6000 from 18
kernel: amdgpu 0000:0f:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00041152

After that, there are many of these:

kernel: amdgpu: [powerplay] Failed to send message 0x40, response 0xffffffc2 param 0x2
kernel: amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80

with some other amdgpu errors sprinkled in until shutdown or hard reset.
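
For anyone triaging a similar log, here is a minimal sketch that pulls the
message id, response and param out of the powerplay lines quoted above and
reinterprets the response as a signed 32-bit value. The helper names are
hypothetical, the pattern is written only against the lines in this report,
and reading the signed values as negative errno codes is an assumption, not
something the driver output confirms.

```python
import re

# Hypothetical triage helper, not part of amdgpu or any fan-control tool.
# The pattern is written only against the powerplay lines quoted in this report.
POWERPLAY_RE = re.compile(
    r"Failed to send message (0x[0-9a-f]+), response (0x[0-9a-f]+),?\s+param (0x[0-9a-f]+)"
)


def to_signed32(value):
    """Reinterpret an unsigned 32-bit integer as two's-complement signed."""
    return int.from_bytes(value.to_bytes(4, "little"), "little", signed=True)


def parse_powerplay(line):
    """Return message id, signed response and param, or None if no match."""
    match = POWERPLAY_RE.search(line)
    if match is None:
        return None
    msg, resp, param = (int(group, 16) for group in match.groups())
    return {"msg": msg, "response": to_signed32(resp), "param": param}


first = parse_powerplay(
    "kernel: amdgpu: [powerplay] Failed to send message 0xf, "
    "response 0xfffffffb, param 0xfd6000"
)
later = parse_powerplay(
    "kernel: amdgpu: [powerplay] Failed to send message 0x40, "
    "response 0xffffffc2 param 0x2"
)
print(first["response"], later["response"])  # -5 -62

# Low 32 bits of the faulting page from the mmhub line, for comparison with param.
fault_page = 0x0000600000FD6000
print(hex(fault_page % 2**32), hex(first["param"]))  # 0xfd6000 0xfd6000
```

The two responses decode to -5 and -62, which match the Linux errno values EIO
and ETIME in magnitude. The param of the first failing message also equals the
low 32 bits of the faulting page address; whether that is meaningful is not
established here.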

It doesn't occur without fan-control software running, so I'm fairly certain
it is related to that.

System:
PowerColor RX 5700 XT Red Devil
Ryzen 7 3800X on X570 Taichi
Manjaro KDE
Manjaro 5.3rc6.d0826.ga55aa89-1
mesa-git 1:19.3.0_devel.114849.0142dcb990e-1
llvm-libs-git 10.0.0_r325376.70e158e09e9-1
In case it matters, the firmware is from
<a href="https://aur.archlinux.org/packages/linux-firmware-agd5f-radeon-navi10/">https://aur.archlinux.org/packages/linux-firmware-agd5f-radeon-navi10/</a>
v2019.08.26.14.36-1</pre>
        </div>
      </p>


    </body>
</html>