radeon power management issues on a 12,2 iMac

Brad Campbell brad at fnarfbargle.com
Thu Sep 6 01:48:57 PDT 2012


I've posted this here simply as google bait for the next poor sod 
suffering with radeon power management issues.

Why does the in-kernel radeon driver try and cook my machine by default?

Since I bought this machine, I've run with a hard-coded hack to keep the 
card in low power mode. Until now this has caused no issues, even while 
watching three full-screen HD streams (one on each head).

A recent change to the radeon driver left me with noise on the 
right-hand side of all three heads, and a plea for assistance was met 
with a response that multiple heads are not supported in low-power mode 
(hence my hack in the first place).

The issue here is that all the multi-head profiles run the card in the 
highest power profile, and the machine ends up sounding like a 747 
trying to keep the card cool.

Poking into radeon_asic.c tells me my card's power profiles are set up 
in r600_pm_init_profile.

Time to have a look at what profiles the card actually has.
Enabling DRM debugging and booting up leaves this in my log:

Sep  6 13:47:37 localhost kernel: [    3.919380] [drm:radeon_pm_print_states], 5 Power State(s)
Sep  6 13:47:37 localhost kernel: [    3.919381] [drm:radeon_pm_print_states], State 0: Default
Sep  6 13:47:37 localhost kernel: [    3.919382] [drm:radeon_pm_print_states],  Default
Sep  6 13:47:37 localhost kernel: [    3.919383] [drm:radeon_pm_print_states],  16 PCIE Lanes
Sep  6 13:47:37 localhost kernel: [    3.919384] [drm:radeon_pm_print_states],  3 Clock Mode(s)
Sep  6 13:47:37 localhost kernel: [    3.919385] [drm:radeon_pm_print_states],          0 e: 680000     m: 900000 v: 1100 No display only
Sep  6 13:47:37 localhost kernel: [    3.919386] [drm:radeon_pm_print_states],          1 e: 680000     m: 900000 v: 1100
Sep  6 13:47:37 localhost kernel: [    3.919387] [drm:radeon_pm_print_states],          2 e: 680000     m: 900000 v: 1100
Sep  6 13:47:37 localhost kernel: [    3.919388] [drm:radeon_pm_print_states], State 1: Performance
Sep  6 13:47:37 localhost kernel: [    3.919389] [drm:radeon_pm_print_states],  16 PCIE Lanes
Sep  6 13:47:37 localhost kernel: [    3.919390] [drm:radeon_pm_print_states],  3 Clock Mode(s)
Sep  6 13:47:37 localhost kernel: [    3.919391] [drm:radeon_pm_print_states],          0 e: 100000     m: 149000 v: 900  No display only
Sep  6 13:47:37 localhost kernel: [    3.919392] [drm:radeon_pm_print_states],          1 e: 398000     m: 900000 v: 1000
Sep  6 13:47:37 localhost kernel: [    3.919393] [drm:radeon_pm_print_states],          2 e: 680000     m: 900000 v: 1100
Sep  6 13:47:37 localhost kernel: [    3.919395] [drm:radeon_pm_print_states], State 2: Default
Sep  6 13:47:37 localhost kernel: [    3.919395] [drm:radeon_pm_print_states],  16 PCIE Lanes
Sep  6 13:47:37 localhost kernel: [    3.919396] [drm:radeon_pm_print_states],  3 Clock Mode(s)
Sep  6 13:47:37 localhost kernel: [    3.919397] [drm:radeon_pm_print_states],          0 e: 298000     m: 900000 v: 950  No display only
Sep  6 13:47:37 localhost kernel: [    3.919398] [drm:radeon_pm_print_states],          1 e: 298000     m: 900000 v: 950
Sep  6 13:47:37 localhost kernel: [    3.919399] [drm:radeon_pm_print_states],          2 e: 680000     m: 900000 v: 1100
Sep  6 13:47:37 localhost kernel: [    3.919400] [drm:radeon_pm_print_states], State 3: Default
Sep  6 13:47:37 localhost kernel: [    3.919401] [drm:radeon_pm_print_states],  16 PCIE Lanes
Sep  6 13:47:37 localhost kernel: [    3.919402] [drm:radeon_pm_print_states],  3 Clock Mode(s)
Sep  6 13:47:37 localhost kernel: [    3.919403] [drm:radeon_pm_print_states],          0 e: 502000     m: 900000 v: 1050 No display only
Sep  6 13:47:37 localhost kernel: [    3.919404] [drm:radeon_pm_print_states],          1 e: 502000     m: 900000 v: 1050
Sep  6 13:47:37 localhost kernel: [    3.919405] [drm:radeon_pm_print_states],          2 e: 680000     m: 900000 v: 1100
Sep  6 13:47:37 localhost kernel: [    3.919406] [drm:radeon_pm_print_states], State 4: Battery
Sep  6 13:47:37 localhost kernel: [    3.919407] [drm:radeon_pm_print_states],  16 PCIE Lanes
Sep  6 13:47:37 localhost kernel: [    3.919408] [drm:radeon_pm_print_states],  3 Clock Mode(s)
Sep  6 13:47:37 localhost kernel: [    3.919409] [drm:radeon_pm_print_states],          0 e: 100000     m: 149000 v: 900  No display only
Sep  6 13:47:37 localhost kernel: [    3.919410] [drm:radeon_pm_print_states],          1 e: 100000     m: 149000 v: 900
Sep  6 13:47:37 localhost kernel: [    3.919411] [drm:radeon_pm_print_states],          2 e: 100000     m: 149000 v: 900
Sep  6 13:47:37 localhost kernel: [    3.920559] [drm] radeon: power management initialized

So, because my GPU is a mobile device (rdev->flags & RADEON_IS_MOBILITY 
is true), my single-head profiles are selected from the Battery state 
(State 4), and because there is no second Battery state the multi-head 
profiles fall back to State 0 (Default).
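
The fallback described above can be modelled outside the kernel. What 
follows is only a simplified sketch, not the driver's code: the enum, 
the imac_states table and find_state() are hypothetical names of my 
own, and the table just copies the state types in the order the log 
above reports them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's state types. */
enum state_type { STATE_DEFAULT, STATE_PERFORMANCE, STATE_BATTERY };

/* State types in the order reported in the log (State 0 .. State 4). */
const enum state_type imac_states[] = {
	STATE_DEFAULT, STATE_PERFORMANCE, STATE_DEFAULT,
	STATE_DEFAULT, STATE_BATTERY,
};

/*
 * Return the index of the (instance + 1)-th state of the wanted type,
 * or fallback_index if the table has fewer states of that type.
 */
int find_state(const enum state_type *states, size_t count,
	       enum state_type wanted, int instance, int fallback_index)
{
	int seen = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (states[i] != wanted)
			continue;
		if (seen == instance)
			return (int)i;
		seen++;
	}
	return fallback_index;
}

/* Walk through the three lookups discussed in the text. */
void check_imac_example(void)
{
	/* Single head: the first Battery state is State 4. */
	assert(find_state(imac_states, 5, STATE_BATTERY, 0, 0) == 4);
	/* Multi head: no second Battery state, so fall back to State 0. */
	assert(find_state(imac_states, 5, STATE_BATTERY, 1, 0) == 0);
	/* The second state labelled Default is State 2. */
	assert(find_state(imac_states, 5, STATE_DEFAULT, 1, 0) == 2);
}
```

With this table, asking for a second Battery state finds nothing and 
lands on State 0, which is the state that runs everything flat out.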

This card actually has 3 profiles called default, and interestingly the 
second one looks almost sane.

Hard coding the profile indexes works, however even with the low profile 
there (Profile 2, clock mode 0) the RAM is running flat out and it still 
generates some not insignificant heat.

So, ultimately I came up with the following hack :

diff -u temp/linux-3.4.4/drivers/gpu/drm/radeon/radeon_drv.c linux-3.4.4/drivers/gpu/drm/radeon/radeon_drv.c
--- temp/linux-3.4.4/drivers/gpu/drm/radeon/radeon_drv.c	2012-09-06 15:27:18.337696944 +0800
+++ linux-3.4.4/drivers/gpu/drm/radeon/radeon_drv.c	2012-09-06 16:06:05.252394033 +0800
@@ -137,6 +137,8 @@
  int radeon_pcie_gen2 = 0;
  int radeon_msi = -1;
  int radeon_lockup_timeout = 10000;
+int radeon_minsclk = 0;
+int radeon_minmclk = 0;

  MODULE_PARM_DESC(no_wb, "Disable AGP writeback for scratch registers");
  module_param_named(no_wb, radeon_no_wb, int, 0444);
@@ -189,6 +191,12 @@
  MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (defaul 10000 = 10 seconds, 0 = disable)");
  module_param_named(lockup_timeout, radeon_lockup_timeout, int, 0444);

+MODULE_PARM_DESC(minsclk, "Minimum GPU clock speed");
+module_param_named(minsclk, radeon_minsclk, int, 0644);
+
+MODULE_PARM_DESC(minmclk, "Minimum Memory clock speed");
+module_param_named(minmclk, radeon_minmclk, int, 0644);
+
  static int radeon_suspend(struct drm_device *dev, pm_message_t state)
  {
  	drm_radeon_private_t *dev_priv = dev->dev_private;
diff -u temp/linux-3.4.4/drivers/gpu/drm/radeon/radeon.h linux-3.4.4/drivers/gpu/drm/radeon/radeon.h
--- temp/linux-3.4.4/drivers/gpu/drm/radeon/radeon.h	2012-09-06 15:27:13.733803910 +0800
+++ linux-3.4.4/drivers/gpu/drm/radeon/radeon.h	2012-09-06 15:45:06.678661305 +0800
@@ -95,6 +95,8 @@
  extern int radeon_pcie_gen2;
  extern int radeon_msi;
  extern int radeon_lockup_timeout;
+extern int radeon_minsclk;
+extern int radeon_minmclk;

  /*
   * Copy from radeon_drv.h so we don't have to include both and have conflicting
diff -u temp/linux-3.4.4/drivers/gpu/drm/radeon/radeon_pm.c linux-3.4.4/drivers/gpu/drm/radeon/radeon_pm.c
--- temp/linux-3.4.4/drivers/gpu/drm/radeon/radeon_pm.c	2012-09-06 15:27:13.739803773 +0800
+++ linux-3.4.4/drivers/gpu/drm/radeon/radeon_pm.c	2012-09-06 15:43:13.115341210 +0800
@@ -120,7 +120,7 @@
  		break;
  	case PM_PROFILE_LOW:
  		if (rdev->pm.active_crtc_count > 1)
-			rdev->pm.profile_index = PM_PROFILE_LOW_MH_IDX;
+			rdev->pm.profile_index = PM_PROFILE_LOW_SH_IDX;
  		else
  			rdev->pm.profile_index = PM_PROFILE_LOW_SH_IDX;
  		break;
@@ -193,7 +193,10 @@
  			clock_info[rdev->pm.requested_clock_mode_index].mclk;
  		if (mclk > rdev->pm.default_mclk)
  			mclk = rdev->pm.default_mclk;
-
+		if (mclk < radeon_minmclk)
+			mclk = radeon_minmclk;
+		if (sclk < radeon_minsclk)
+			sclk = radeon_minsclk;
  		/* upvolt before raising clocks, downvolt after lowering clocks */
  		if (sclk < rdev->pm.current_sclk)
  			misc_after = true;

I can now set a minimum clock speed for both the GPU and RAM and 
activate it by switching profiles.
Turning the GPU clock up to the same speed as the RAM in the lowest 
profile (150000) and running Clock mode 0 in Profile 5 sees all my 
visual artefacts go away, and I can resume using the machine without 
screaming fans.
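
For anyone wanting to see what the radeon_pm.c hunk does in isolation, 
here is a minimal sketch of the same order of checks as a standalone 
function. The name apply_clock_limits() and its arguments are mine 
(hypothetical), not anything in the driver, and the values are assumed 
to be in whatever units the driver uses internally.

```c
#include <assert.h>

/*
 * Mirror the order of checks in the patch: first cap the requested
 * clock at the card's default clock, then raise it to the user's
 * minimum. A minimum of 0 (the parameter default) changes nothing.
 */
int apply_clock_limits(int requested, int default_clock, int user_min)
{
	int clock = requested;

	if (clock > default_clock)
		clock = default_clock;
	if (clock < user_min)
		clock = user_min;
	return clock;
}

/* A few cases showing how the two limits interact. */
void check_clock_limits(void)
{
	/* No minimum set: the requested clock is used as-is. */
	assert(apply_clock_limits(100000, 680000, 0) == 100000);
	/* A minimum above the requested clock raises it. */
	assert(apply_clock_limits(100000, 680000, 150000) == 150000);
	/* Requests above the default are still capped. */
	assert(apply_clock_limits(900000, 680000, 0) == 680000);
	/* The minimum is applied last, so it can exceed the default. */
	assert(apply_clock_limits(100000, 680000, 700000) == 700000);
}
```

Note the last case: because the minimum is checked after the cap, a 
minimum set higher than the default clock wins, so the hack does 
nothing to stop an accidental overclock.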

Obviously the selection of correct default power profiles is a difficult 
issue and subject to the vagaries of the lunatic who wrote the card's 
BIOS. I don't pretend to have the answer, but I do have a hack that 
works for me (ugly as it may be).

I'm happy to work on a fix for this (it's not like I'm an isolated case 
here, a quick google search turns up plenty of hits) if someone can help 
me understand the right way to fix it properly.

Regards,
Brad

