[Bug 96964] R290X stuck at 100% GPU load / full core clock on non-x86 machines

bugzilla-daemon at freedesktop.org
Sun Jul 17 10:19:13 UTC 2016


https://bugs.freedesktop.org/show_bug.cgi?id=96964

            Bug ID: 96964
           Summary: R290X stuck at 100% GPU load / full core clock on
                    non-x86 machines
           Product: DRI
           Version: XOrg git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Radeon
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: kb9vqf at pearsoncomputing.net

Our twin Radeon 290X cards are stuck at 100% GPU load (according to radeontop
and Gallium) and full core clock (according to radeon_pm_info) on non-x86
machines such as our POWER8 compute server.  The identical card does not show
this behaviour on a test x86 machine.

Forcibly crashing the GPU (which causes a soft reset) fixes the issue.  Relevant
dmesg output starts at line 4 in this pastebin:
https://bugzilla.kernel.org/show_bug.cgi?id=70651  It is unknown whether
triggering a soft reset without crashing the GPU would also resolve the issue.

I suspect this is because the x86-specific AtomBIOS oprom code executes only
on x86 machines, so the related setup is never finalized by the radeon driver
itself on non-x86 machines.  However, this is just an educated guess.

radeontop output of stuck card:
gpu 100.00%, ee 0.00%, vgt 0.00%, ta 0.00%, sx 0.00%, sh 0.00%, spi 0.00%, sc 0.00%, pa 0.00%, db 0.00%, cb 0.00%

radeontop output of "fixed" card after GPU crash / reset, running 3D app:
gpu 4.17%, ee 0.00%, vgt 0.00%, ta 3.33%, sx 3.33%, sh 0.00%, spi 3.33%, sc 3.33%, pa 0.00%, db 3.33%, cb 3.33%, vram 11.72% 479.87mb

Despite the "100% GPU load" indication, there is no sign of actual load being
placed on the GPU.  3D-intensive applications run correctly with no apparent
performance degradation, so the reading appears to be (a) spurious and
(b) causing the core clock to ramp up needlessly.
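
For anyone trying to reproduce this, the "stuck" pattern can be checked
mechanically from a radeontop sample line.  The following is only a minimal
sketch: the helper names, the 99% threshold, and the rule "gpu pegged while
every other block reads 0%" are assumptions chosen to match the two samples
quoted above, not anything defined by radeontop or the radeon driver.

```python
import re

# Matches "name 12.34%" pairs as they appear in the sample lines above.
FIELD_RE = re.compile(r"([a-z]+)\s+(\d+(?:\.\d+)?)%")


def parse_radeontop_line(line):
    """Return a dict mapping block name (gpu, ee, vgt, ...) to its percentage."""
    return {name: float(value) for name, value in FIELD_RE.findall(line)}


def looks_spurious(sample, busy_threshold=99.0):
    """Heuristic (assumption): 'gpu' is pegged while all other blocks are idle."""
    gpu = sample.get("gpu", 0.0)
    # vram is memory usage, not a busy indicator, so it is ignored here.
    others = [v for k, v in sample.items() if k not in ("gpu", "vram")]
    return gpu >= busy_threshold and bool(others) and all(v == 0.0 for v in others)


# The two samples quoted in this report, joined onto single lines.
STUCK = ("gpu 100.00%, ee 0.00%, vgt 0.00%, ta 0.00%, sx 0.00%, sh 0.00%, "
         "spi 0.00%, sc 0.00%, pa 0.00%, db 0.00%, cb 0.00%")
FIXED = ("gpu 4.17%, ee 0.00%, vgt 0.00%, ta 3.33%, sx 3.33%, sh 0.00%, "
         "spi 3.33%, sc 3.33%, pa 0.00%, db 3.33%, cb 3.33%, "
         "vram 11.72% 479.87mb")

if __name__ == "__main__":
    print(looks_spurious(parse_radeontop_line(STUCK)))  # True
    print(looks_spurious(parse_radeontop_line(FIXED)))  # False
```

A single sample can be misleading, so in practice one would probably want to
apply such a check to several consecutive samples before concluding that the
reading is spurious.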
