Massive power regression going 3.4->3.5

Ben Widawsky ben at bwidawsk.net
Tue Aug 7 19:44:37 PDT 2012


On 2012-08-07 13:43, James Bottomley wrote:
> On Sun, 2012-08-05 at 22:36 +0200, Daniel Vetter wrote:
>> On Wed, Aug 01, 2012 at 11:08:19AM +0100, James Bottomley wrote:
>> > On Wed, 2012-08-01 at 09:58 +0100, Chris Wilson wrote:
>> > > On Wed, 01 Aug 2012 09:45:04 +0100, James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
>> > > > On Wed, 2012-08-01 at 09:16 +0100, Chris Wilson wrote:
>> > > > > On Wed, 01 Aug 2012 09:06:12 +0100, James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
>> > > > > > I got the attached to apply and it doesn't really improve the
>> > > > > > idle power much (12.5W).
>> > > > >
>> > > > > That's good to know. Next step is to try overriding
>> > > > > i915.semaphores. Can you please test with i915.semaphores=0
>> > > > > and i915.semaphores=1?
>> > > >
>> > > > There's not much point doing i915_semaphores=1 since that's the
>> > > > default on gen 6 hardware, but i915_semaphores=0 recovers an idle
>> > > > power of ~6.5W
>> > >
>> > > It is only the default if iommu is off, and changing the default
>> > > was one of the side-effects of the patch you bisected.
>> > >
>> > > Can you please login to the desktop, let it idle, record
>> > > /sys/kernel/debug/dri/0/i915_cur_delayinfo and .../i915_drpc_info.
>> > > Then trace-cmd record -e i915 sleep 10s, and follow up with a new
>> > > pair of /sys/kernel/debug/dri/0/i915_cur_delayinfo and
>> > > .../i915_drpc_info.
>> > >
>> > > This will let us see whether the pm counters are truly advancing
>> > > and what activity the driver is performing whilst idle.
>> >
>> > OK, so here it is
>> >
>> > James
>>
>> Hm, if I haven't botched the math, you have a rc6 residency of about
>> 320 seconds between the two cats of drpc_info. Can you please script
>> this so that we have exactly 10s in between? (Aside: 3.6 has a neat
>> interface for rc6 residency in sysfs ...)
>
> You botched the maths, I think.  The three cats after the sleep were
> three up arrows ... if it went over 11s I'd be surprised.
>
>> Also, you need to attach the output of trace-cmd report (like with
>> perf), so that we see the tracepoints in detail.
>
> You mean you want the full trace.dat file rather than what the output
> summary says?  I can, but it's 800k compressed which is probably over
> the list limit ... I can upload it somewhere when I get back from
> holiday next Monday.
>
>> Another quick thing to confirm: What is the power consumption on the
>> old kernel when booting with i915.i915_semaphores=1?
>
> It idles at around 13W, which means the history of the problem must be
> this:
>
> What looks to have happened seems to be because of a merge failure in
> drm:
>
> In 3.2 Keith Packard disabled semaphores on sandybridge with
>
> commit ebbd857e6b9a92c0aff4aacd1b1d2361d888633e
> Author: Keith Packard <keithp at keithp.com>
> Date:   Mon Dec 26 17:02:10 2011 -0800
>
>     drm/i915: Disable semaphores by default on SNB
>
> Because of an apparent bug causing a GPU hang.
>
> I think this is what gave me the power savings in 3.4 when the PCI
> layer was ready for it.
>
> It got re-enabled accidentally in 3.5 by a mismerge of
>
> commit 2911a35b2e4eb87ec48d03aeb11f019e51ae3c0d
> Author: Ben Widawsky <ben at bwidawsk.net>
> Date:   Thu Apr 5 14:47:36 2012 -0700
>
>     drm/i915: use semaphores for the display plane
>
> Because that puts back the pre ebbd857e6b9a92c0aff4aacd1b1d2361d888633e
> semaphore enabling code, but in a different place, which is probably
> why it wasn't spotted, so semaphores got re-enabled on sandybridge.
>
> Perhaps what we should be doing is verifying that semaphores aren't
> sucking the same 6W of power on ivybridge and if not, just re-disable
> them on sandybridge, since we'll have to do that anyway to re-apply
> the bug fix.
>
> James

Hi James. Would you mind filing a bug on this? While trying to reproduce
this issue, I ran into a similar but distinct problem, i.e. one that is
not resolved with semaphores=0. The issue I see can be reproduced with
intel-gpu-tools/tests/sysfs_rc6_residency. That test is basically the
same thing as what Chris and Daniel were asking for earlier with the
drpc debugfs file info. In any case, it would be good to centralize all
the data we've collected somewhere other than mailing list attachments.

http://intellinuxgraphics.org/how_to_report_bug.html
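
For anyone who wants to script the "exactly 10s in between" measurement
requested earlier in the thread, a rough sketch follows. It is not the
code of sysfs_rc6_residency, only an illustration of the same idea: read
a residency counter, wait a fixed interval, read it again, and compare
the delta with the elapsed time. The sysfs path is an assumption based
on the 3.6 rc6 residency interface mentioned above, and the function
names are made up for this example.

```python
import time

# Assumption: location of the 3.6 sysfs rc6 residency counter (milliseconds).
# Adjust the card number or path for your system.
RC6_PATH = "/sys/class/drm/card0/power/rc6_residency_ms"


def read_counter(path=RC6_PATH):
    """Read a single integer counter value from a sysfs file."""
    with open(path) as f:
        return int(f.read().strip())


def residency_fraction(read, interval_s=10.0, sleep=time.sleep,
                       clock=time.monotonic):
    """Return the fraction of the interval the counter says was spent in rc6.

    `read` is any zero-argument callable returning the counter in ms.
    `sleep` and `clock` are injectable so the logic can be checked
    without real hardware.
    """
    start = clock()
    before = read()
    sleep(interval_s)
    after = read()
    elapsed_ms = (clock() - start) * 1000.0
    if elapsed_ms <= 0:
        raise ValueError("elapsed time must be positive")
    return (after - before) / elapsed_ms


# Example use on a machine that has the sysfs file (needs no tracing):
#   frac = residency_fraction(read_counter)
#   print("rc6 residency over interval: %.1f%%" % (frac * 100.0))
```

A value near 1.0 on an idle desktop would suggest rc6 is being entered
as expected; a value near 0 would point at something keeping the GPU
awake. Timing the interval with a monotonic clock rather than trusting
the sleep avoids the "was it 10s or 320s" ambiguity.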

P.S. Sorry if you already filed a bug somewhere earlier in the thread;
I've been having mail problems.


-- 
Ben Widawsky, Intel Open Source Technology Center

