[Bug 70334] New: [bisected]igt/module_reload causes system hang on queued branch
bugzilla-daemon at freedesktop.org
Thu Oct 10 09:40:26 CEST 2013
https://bugs.freedesktop.org/show_bug.cgi?id=70334
Priority: high
Bug ID: 70334
CC: intel-gfx-bugs at lists.freedesktop.org
Assignee: intel-gfx-bugs at lists.freedesktop.org
Summary: [bisected]igt/module_reload causes system hang on queued branch
QA Contact: intel-gfx-bugs at lists.freedesktop.org
Severity: critical
Classification: Unclassified
OS: Linux (All)
Reporter: xunx.fang at intel.com
Hardware: All
Status: NEW
Version: unspecified
Component: DRM/Intel
Product: DRI
Created attachment 87373
--> https://bugs.freedesktop.org/attachment.cgi?id=87373&action=edit
netconsole log
System Environment:
--------------------------
Platform: Ivybridge
Kernel: (drm-intel-nightly)c5ea23067eb3f0bc86dea95b8f544c7ef8dfea54
Bug detailed description:
-----------------------------
The system hangs when the i915 module is unloaded. It happens on all platforms
on the kernel queued branch. Bisecting shows that
b29c19b645287f7062e17d70fa4e9781a01a5d88 is the first bad commit.
commit b29c19b645287f7062e17d70fa4e9781a01a5d88
Author: Chris Wilson <chris at chris-wilson.co.uk>
AuthorDate: Wed Sep 25 17:34:56 2013 +0100
Commit: Daniel Vetter <daniel.vetter at ffwll.ch>
CommitDate: Thu Oct 3 20:01:31 2013 +0200
drm/i915: Boost RPS frequency for CPU stalls
If we encounter a situation where the CPU blocks waiting for results
from the GPU, give the GPU a kick to boost its frequency.
This should work to reduce user interface stalls and to quickly promote
mesa to high frequencies - but the cost is that our requested frequency
stalls high (as we do not idle for long enough before rc6 to start
reducing frequencies, nor are we aggressive at down clocking an
underused GPU). However, this should be mitigated by rc6 itself powering
off the GPU when idle, and that energy use is dependent upon the workload
of the GPU in addition to its frequency (e.g. the math or sampler
functions only consume power when used). Still, this is likely to
adversely affect light workloads.
In particular, this nearly eliminates the highly noticeable wake-up lag
in animations from idle. For example, expose or workspace transitions.
(However, given the situation where we fail to downclock, our requested
frequency is almost always the maximum, except for Baytrail where we
manually downclock upon idling. This often masks the latency of
upclocking after being idle, so animations are typically smooth - at the
cost of increased power consumption.)
Stéphane raised the concern that this will punish good applications and
reward bad applications - but due to the nature of how mesa performs its
client throttling, I believe all mesa applications will be roughly
equally affected. To address this concern, and to prevent applications
like compositors from permanently boosting the RPS state, we ratelimit the
frequency of the wait-boosts each client receives.
Unfortunately, this technique is ineffective with Ironlake - which also
has dynamic render power states and suffers just as dramatically. For
Ironlake, the thermal/power headroom is shared with the CPU through
Intelligent Power Sharing and the intel-ips module. This leaves us with
no GPU boost frequencies available when coming out of idle, and due to
hardware limitations we cannot change the arbitration between the CPU and
GPU quickly enough to be effective.
v2: Limit each client to receiving a single boost for each active period.
Tested by QA to only marginally increase power, and to demonstrably
increase throughput in games. No latency measurements yet.
v3: Cater for front-buffer rendering with manual throttling.
v4: Tidy up.
v5: Sadly the compositor needs frequent boosts as it may never idle, but
due to its picking mechanism (using ReadPixels) may require frequent
waits. Those waits, along with the waits for the vrefresh swap, conspire
to keep the GPU at low frequencies despite the interactive latency. To
overcome this we ditch the one-boost-per-active-period and just ratelimit
the number of wait-boosts each client can receive.
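For readers unfamiliar with the idea, the per-client ratelimiting described in v5 can be sketched roughly as follows. This is only an illustration, not the code from the commit: the struct, the function name client_may_boost, and the BOOST_INTERVAL_MS value are all hypothetical, and the sketch assumes a monotonic millisecond clock supplied by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-client bookkeeping; not the i915 driver's real structures. */
struct client_boost_state {
    uint64_t last_boost_ms;  /* time of the most recently granted boost */
    bool     boosted_once;   /* false until the first boost is granted */
};

/* Assumed minimum spacing between boosts for one client (illustrative value). */
#define BOOST_INTERVAL_MS 1000u

/*
 * Returns true if this client may receive a wait-boost at time now_ms and
 * records the grant. Returns false if the client was boosted too recently.
 * now_ms is assumed to come from a monotonic clock.
 */
static bool client_may_boost(struct client_boost_state *c, uint64_t now_ms)
{
    if (c->boosted_once && now_ms - c->last_boost_ms < BOOST_INTERVAL_MS)
        return false;

    c->boosted_once = true;
    c->last_boost_ms = now_ms;
    return true;
}
```

Because the state is kept per client, a compositor that waits frequently cannot hold the GPU at a boosted frequency permanently, while each other client still gets its own boost allowance.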
Reproduce steps:
----------------------------
1. ./module_reload
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are on the CC list for the bug.
You are the assignee for the bug.