[Intel-gfx] [PATCH] drm/i915: Prevent the system suspend complete optimization

Imre Deak imre.deak at intel.com
Tue Apr 11 16:12:35 UTC 2017


Since

commit bac2a909a096c9110525c18cbb8ce73c660d5f71
Author: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
Date:   Wed Jan 21 02:17:42 2015 +0100

    PCI / PM: Avoid resuming PCI devices during system suspend

PCI devices will default to allowing the system suspend complete
optimization where devices are not woken up during system suspend if
they were already runtime suspended. This however breaks the i915/HDA
drivers for two reasons:

- The i915 driver has system suspend specific steps that it needs to
  run, that bring the device to a different state than its runtime
  suspended state.

- The HDA driver's suspend handler requires power that it will request
  from the i915 driver's power domain handler. This in turn requires the
  i915 driver to runtime resume itself, but this won't be possible if the
  suspend complete optimization is in effect: in this case the i915
  runtime PM is disabled and trying to get an RPM reference returns
  -EACCESS.

Solve this by getting/putting an RPM reference in the i915's
prepare/complete hooks, which in effect disables the suspend complete
optimization.

One possibility for this bug staying hidden for so long is that the
optimization for a device is disabled if it's disabled for any of its
children devices. i915 may have a backlight device as its children which
doesn't support runtime PM and so doesn't allow the optimization either.
So if this backlight device got registered the bug stayed hidden.

Credits to Marta, Tomi and David who enabled pstore logging,
that caught one instance of this issue across a suspend/
resume-to-ram and Ville who rememberd that the optimization was enabled
for some devices at one point.

The first WARN triggered by the problem:

[ 6250.746445] WARNING: CPU: 2 PID: 17384 at drivers/gpu/drm/i915/intel_runtime_pm.c:2846 intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746448] pm_runtime_get_sync() failed: -13
[ 6250.746451] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core ptp mei_me pps_core snd_pcm lpc_ich mei prime_
numbers i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915]
[ 6250.746512] CPU: 2 PID: 17384 Comm: kworker/u8:0 Tainted: G     U  W       4.11.0-rc5-CI-CI_DRM_334+ #1
[ 6250.746515] Hardware name:                  /NUC5i5RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
[ 6250.746521] Workqueue: events_unbound async_run_entry_fn
[ 6250.746525] Call Trace:
[ 6250.746530]  dump_stack+0x67/0x92
[ 6250.746536]  __warn+0xc6/0xe0
[ 6250.746542]  ? pci_restore_standard_config+0x40/0x40
[ 6250.746546]  warn_slowpath_fmt+0x46/0x50
[ 6250.746553]  ? __pm_runtime_resume+0x56/0x80
[ 6250.746584]  intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746610]  intel_display_power_get+0x1b/0x40 [i915]
[ 6250.746646]  i915_audio_component_get_power+0x15/0x20 [i915]
[ 6250.746654]  snd_hdac_display_power+0xc8/0x110 [snd_hda_core]
[ 6250.746661]  azx_runtime_resume+0x218/0x280 [snd_hda_intel]
[ 6250.746667]  pci_pm_runtime_resume+0x76/0xa0
[ 6250.746672]  __rpm_callback+0xb4/0x1f0
[ 6250.746677]  ? pci_restore_standard_config+0x40/0x40
[ 6250.746682]  rpm_callback+0x1f/0x80
[ 6250.746686]  ? pci_restore_standard_config+0x40/0x40
[ 6250.746690]  rpm_resume+0x4ba/0x740
[ 6250.746698]  __pm_runtime_resume+0x49/0x80
[ 6250.746703]  pci_pm_suspend+0x57/0x140
[ 6250.746709]  dpm_run_callback+0x6f/0x330
[ 6250.746713]  ? pci_pm_freeze+0xe0/0xe0
[ 6250.746718]  __device_suspend+0xf9/0x370
[ 6250.746724]  ? dpm_watchdog_set+0x60/0x60
[ 6250.746730]  async_suspend+0x1a/0x90
[ 6250.746735]  async_run_entry_fn+0x34/0x160
[ 6250.746741]  process_one_work+0x1f2/0x6d0
[ 6250.746749]  worker_thread+0x49/0x4a0
[ 6250.746755]  kthread+0x107/0x140
[ 6250.746759]  ? process_one_work+0x6d0/0x6d0
[ 6250.746763]  ? kthread_create_on_node+0x40/0x40
[ 6250.746768]  ret_from_fork+0x2e/0x40
[ 6250.746778] ---[ end trace 102a62fd2160f5e6 ]---

Fixes: bac2a909a096 ("PCI / PM: Avoid resuming PCI devices during system suspend")
References: https://bugs.freedesktop.org/show_bug.cgi?id=100378
Cc: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
Cc: Marta Lofstedt <marta.lofstedt at intel.com>
Cc: David Weinehall <david.weinehall at linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela at intel.com>
Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala at intel.com>
Cc: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Takashi Iwai <tiwai at suse.de>
Cc: stable at vger.kernel.org
Signed-off-by: Imre Deak <imre.deak at intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index bd85e38..6185e28 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1862,6 +1862,31 @@ void i915_reset(struct drm_i915_private *dev_priv)
 	goto finish;
 }
 
+static int i915_pm_prepare(struct device *kdev)
+{
+	/*
+	 * Get a reference to disable the direct complete optimization. This
+	 * is needed, since we have a suspend sequence specific to system
+	 * suspend (that is different from runtime suspend) and because we
+	 * need to provide power to the sound driver while its system suspend
+	 * handler is running. This is not possible with the optimization in
+	 * effect, when the i915 runtime PM is disabled for the whole duration
+	 * of the suspend sequence if the device was already runtime
+	 * suspended at the beginning of the sequence. In this case the i915
+	 * suspend/resume hooks would be also skipped (besides its prepare and
+	 * complete hooks).
+	 */
+	intel_runtime_pm_get(kdev_to_i915(kdev));
+
+	return 0;
+}
+
+static void i915_pm_complete(struct device *kdev)
+{
+	/* Put the ref taken in the prepare step. */
+	intel_runtime_pm_put(kdev_to_i915(kdev));
+}
+
 static int i915_pm_suspend(struct device *kdev)
 {
 	struct pci_dev *pdev = to_pci_dev(kdev);
@@ -2490,6 +2515,13 @@ static int intel_runtime_resume(struct device *kdev)
 
 const struct dev_pm_ops i915_pm_ops = {
 	/*
+	 * Handlers called as the first and last step during the S0ix, S3 and
+	 * S4 power sequences.
+	 */
+	.prepare = i915_pm_prepare,
+	.complete = i915_pm_complete,
+
+	/*
 	 * S0ix (via system suspend) and S3 event handlers [PMSG_SUSPEND,
 	 * PMSG_RESUME]
 	 */
-- 
2.5.0



More information about the Intel-gfx mailing list