[Intel-gfx] [PATCH 13/19] drm/i915: Boost RPS frequency for CPU stalls

Wed Sep 11 00:53:22 CEST 2013

On Tue, Sep 10, 2013 at 07:36:42PM -0300, Rodrigo Vivi wrote:
> From: Chris Wilson <chris at chris-wilson.co.uk>
> 
> If we encounter a situation where the CPU blocks waiting for results
> from the GPU, give the GPU a kick to boost its frequency.
> 
> This should work to reduce user interface stalls and to quickly promote
> mesa to high frequencies - but the cost is that our requested frequency
> stalls high (as we do not idle for long enough before rc6 to start
> reducing frequencies, nor are we aggressive at down clocking an
> underused GPU). However, this should be mitigated by rc6 itself powering
> off the GPU when idle, and that energy use is dependent upon the workload
> of the GPU in addition to its frequency (e.g. the math or sampler
> functions only consume power when used). Still, this is likely to
> adversely affect light workloads.
> 
> In particular, this nearly eliminates the highly noticeable wake-up lag
> in animations from idle. For example, expose or workspace transitions.
> (However, given the situation where we fail to downclock, our requested
> frequency is almost always the maximum, except for Baytrail where we
> manually downclock upon idling. This often masks the latency of
> upclocking after being idle, so animations are typically smooth - at the
> cost of increased power consumption.)
> 
> Stéphane raised the concern that this will punish good applications and
> reward bad applications - but due to the nature of how mesa performs its
> client throttling, I believe all mesa applications will be roughly
> equally affected. To address this concern, and to prevent applications
> like compositors from regularly boosting the RPS state, we only allow
> each client to receive one boost in each period of activity.
> 
> Unfortunately, this technique is ineffective with Ironlake - which also
> has dynamic render power states and suffers just as dramatically. For
> Ironlake, the thermal/power headroom is shared with the CPU through
> Intelligent Power Sharing and the intel-ips module. This leaves us with
> no GPU boost frequencies available when coming out of idle, and due to
> hardware limitations we cannot change the arbitration between the CPU and
> GPU quickly enough to be effective.
> 
> v2: Limit each client to receiving a single boost for each active period.
> v3: Tidy up. Allow the device to always boost if it waits outside of
> client context.
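
For readers following along, the v2/v3 policy described above can be
sketched as a small standalone model. This is only an illustration, not
the code from the patch: every name here (gpu_model, client_model,
wait_boost, client_idle, gpu_downclock) is hypothetical, and the point at
which a client's boost allowance is reset is an assumption drawn from the
wording "one boost in each period of activity".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names come from the i915 driver. */
struct gpu_model {
	int cur_freq;	/* currently requested frequency */
	int min_freq;
	int max_freq;
};

struct client_model {
	bool boosted;	/* already boosted during this period of activity? */
};

/*
 * Called when the CPU is about to block waiting on the GPU.
 * client == NULL models a wait outside of any client context, which
 * (per v3) is always allowed to boost.
 * Returns true if the frequency was boosted by this call.
 */
static bool wait_boost(struct gpu_model *gpu, struct client_model *client)
{
	assert(gpu != NULL);

	if (client) {
		if (client->boosted)
			return false;	/* one boost per active period (v2) */
		client->boosted = true;
	}

	gpu->cur_freq = gpu->max_freq;
	return true;
}

/* Assumed reset point: the client has no outstanding work left. */
static void client_idle(struct client_model *client)
{
	client->boosted = false;
}

/* Models the requested frequency dropping back once the GPU idles. */
static void gpu_downclock(struct gpu_model *gpu)
{
	gpu->cur_freq = gpu->min_freq;
}
```

In this model a client that never reaches client_idle() gets exactly one
boost, even if gpu_downclock() runs in between its waits.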

There's a remaining issue: the compositor needs the boost regularly. Even
though compositors never idle themselves (so under the above scheme they
are only allowed one wait-boost), the GPU sleeps while waiting for flips.
The latency of waking up from rc6 after a flip is enough for the
compositor to start dropping frames.

At the moment I'm running with

http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=perf&id=377a87649111b3fccf5ba488811706c3aeab89e1

on top (and v4 anyway).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre