[Bug 91880] Radeonsi on Grenada cards (r9 390) exceptionally unstable and poorly performing

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Wed Sep 13 22:41:15 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=91880

--- Comment #172 from Jon Doane <jrdoane at gmail.com> ---
(In reply to Thomas DEBESSE from comment #171)
> > This sounds a lot like what I've been doing manually which sounds nice.
> > Thanks for the input. I honestly would like a solution that doesn't cause my
> > machine to draw an additional 90 watts at idle though.
> 
> Unlike the kernel patch above, that systemd service sets the GPU to
> "low battery" by default, which is the most energy-saving profile. The
> provided `dpm-query` tool allows you to change that at any time. That's what
> I'm doing: at init, my GPU is set to the "low battery" profile, and when I
> need to run a heavy task, I do that:
> 
> dpm-query set all high performance
> 
> And then once the heavy task is done, I do that to save energy again:
> 
> dpm-query set all low battery
> 
> With the default config for the service, you just have to add your own user
> to the "video" group to have the right to change the profile as user.
> 
> So, even if the patch above gets merged one day, this service and tool are
> still useful: it's an easy way to change the default profile, whatever the
> default is.
> 
> Notice that the kernel patch above only sets the level to "high", but keeps
> the state at "balanced", so it's still adaptive. What "high balanced" does
> is set the shader and memory frequencies to the max, which draws more power
> than the default, but you will notice the fans are still idle or stopped if
> you do nothing, because it's still saving a lot of energy. If you set "high
> performance", the fans will start almost instantly because there is no
> saving anymore. So "high balanced" saves less energy than "auto balanced",
> but still saves a lot of energy because it does not have to cool the chip
> while doing nothing (meaning the chip does nothing strong enough to get
> hot).

Unless something has changed with how the dpm state is handled, I don't expect
that to make the system completely stable. The forced profiles are more stable
than balanced, but not stable enough to prevent a crash. I tested this by
starting off with:
echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level

The only method that I've had luck with while retaining clock scaling is this:
echo 234567 > /sys/class/drm/card0/device/pp_dpm_sclk

This disables the 300 MHz clock step, which seems to work. However, I've
observed that doing this also forces the memory clocks to full tilt instead of
idle, so I'm uncertain whether the memory clock or the core clock is
responsible.
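
To make the workaround above easier to repeat, here is a minimal shell sketch
that builds the level string from the card's own state listing and reports
which clock state is currently active. It assumes each pp_dpm_* file lists one
state per line as "<index>: <clock>Mhz" with "*" marking the active state, and
that the kernel accepts the concatenated-digit form used above (newer kernels
may expect space-separated indices instead). The function names levels_from
and active_state are hypothetical helpers, not part of any existing tool.

```shell
#!/bin/sh
# Sketch only. Assumes pp_dpm_sclk / pp_dpm_mclk list states as
# "<index>: <clock>Mhz", with "*" at the end of the active state's line.

# levels_from MIN: read a state listing on stdin and print the indices of
# all states whose index is >= MIN, concatenated (single-digit indices only).
levels_from() {
    awk -F: -v min="$1" \
        '$1 ~ /^[0-9]+$/ && $1 + 0 >= min { printf "%s", $1 }'
}

# active_state: read a state listing on stdin and print the clock of the
# state marked with "*".
active_state() {
    awk '/\* *$/ { print $2 }'
}

# Possible usage as root (not executed here; card0 is an assumption):
#   dev=/sys/class/drm/card0/device
#   levels_from 2 < "$dev/pp_dpm_sclk" > "$dev/pp_dpm_sclk"   # unsafe: same file
#   mask=$(levels_from 2 < "$dev/pp_dpm_sclk"); echo "$mask" > "$dev/pp_dpm_sclk"
#   active_state < "$dev/pp_dpm_mclk"
```

Of the two commented writes, only the second form (read into a variable, then
write) is meant to be used. Checking active_state on pp_dpm_mclk before and
after the write is one way to see whether the memory clock really gets pinned
at its highest state. Current kernel documentation describes setting
power_dpm_force_performance_level to manual before writing these files;
whether that is needed on this kernel is an assumption I have not verified.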

Something else I've observed: if my machine crashes and I use the reset button
to restart it, then when X loads it always crashes again unless I force the
clocks up. Part of the old image that was on the screen at the initial crash
also gets displayed, rather garbled but recognizable, which makes me think the
problem is related to the memory clock or to how GPU memory is managed.

One way or another, I have ways around the problem, but these are hacks that a
regular user would consider intolerable.
