<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - R290X stuck at 100% GPU load / full core clock on non-x86 machines"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=96964">96964</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>R290X stuck at 100% GPU load / full core clock on non-x86 machines
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DRI
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>XOrg git
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Other
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>DRM/Radeon
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dri-devel@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>kb9vqf@pearsoncomputing.net
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Our twin Radeon 290X cards are stuck at 100% GPU load (according to radeontop
and Gallium) and full core clock (according to radeon_pm_info) on non-x86
machines such as our POWER8 compute server.  The identical card does not show
this behaviour on a test x86 machine.

Forcibly crashing the GPU (causing a soft reset) fixes the issue.  Relevant
dmesg output starts at line 4 in this pastebin:
<a href="https://bugzilla.kernel.org/show_bug.cgi?id=70651">https://bugzilla.kernel.org/show_bug.cgi?id=70651</a>  It is unknown if simply
triggering a soft reset without the GPU crash would also resolve the issue.

I suspect this is related to the atombios x86-specific oprom code only
executing on x86 machines, and related setup therefore not being finalized by
the radeon driver itself on non-x86 machines.  However, this is just an
educated guess.

radeontop output of stuck card:
gpu 100.00%, ee 0.00%, vgt 0.00%, ta 0.00%, sx 0.00%, sh 0.00%, spi 0.00%, sc
0.00%, pa 0.00%, db 0.00%, cb 0.00%

radeontop output of "fixed" card after GPU crash / reset, running 3D app:
gpu 4.17%, ee 0.00%, vgt 0.00%, ta 3.33%, sx 3.33%, sh 0.00%, spi 3.33%, sc
3.33%, pa 0.00%, db 3.33%, cb 3.33%, vram 11.72% 479.87mb

Despite the "100% GPU load" indication, there is no sign of actual load being
placed on the GPU.  3D-intensive applications function 100% correctly with no
apparent performance degradation, so it seems the reading is a.) spurious and
b.) causing the core clock to throttle up needlessly.</pre>
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are the assignee for the bug.</li>
      </ul>
    </body>
</html>