[Intel-gfx] [PATCH v2] drm/i915: Hold rpm during GEM suspend in driver unload/suspend
Imre Deak
imre.deak at intel.com
Thu Mar 2 12:26:57 UTC 2017
On Thu, Mar 02, 2017 at 08:30:29AM +0000, Chris Wilson wrote:
> i915_gem_suspend() tries to access the device to ensure it is idle and
> all writes from the device are flushed to memory. It assumed it already
> held the runtime pm wakeref, but we should explicitly acquire it for our
> access to be safe.
>
> [ 619.926287] WARNING: CPU: 3 PID: 9353 at drivers/gpu/drm/i915/intel_drv.h:1750 gen6_write32+0x23e/0x2a0 [i915]
> [ 619.926300] RPM wakelock ref not held during HW access
> [ 619.926311] Modules linked in: vgem x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec coretemp snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul snd_pcm mei_me mei lpc_ich ghash_clmulni_intel i915(-) sdhci_pci sdhci mmc_core e1000e ptp pps_core prime_numbers [last unloaded: snd_hda_intel]
> [ 619.926578] CPU: 3 PID: 9353 Comm: drv_module_relo Tainted: G U 4.10.0-CI-Trybot_609+ #1
> [ 619.926585] Hardware name: LENOVO 42962WU/42962WU, BIOS 8DET56WW (1.26 ) 12/01/2011
> [ 619.926592] Call Trace:
> [ 619.926609] dump_stack+0x67/0x92
> [ 619.926625] __warn+0xc6/0xe0
> [ 619.926640] warn_slowpath_fmt+0x4a/0x50
> [ 619.926726] gen6_write32+0x23e/0x2a0 [i915]
> [ 619.926801] gen6_mm_switch+0x38/0x70 [i915]
> [ 619.926871] i915_switch_context+0xec/0xa10 [i915]
> [ 619.926942] i915_gem_switch_to_kernel_context+0x13c/0x2b0 [i915]
> [ 619.927019] i915_gem_suspend+0x2b/0x180 [i915]
> [ 619.927079] i915_driver_unload+0x22/0x200 [i915]
> [ 619.927093] ? __this_cpu_preempt_check+0x13/0x20
> [ 619.927105] ? trace_hardirqs_on_caller+0xe7/0x200
> [ 619.927118] ? trace_hardirqs_on+0xd/0x10
> [ 619.927128] ? _raw_spin_unlock_irqrestore+0x3d/0x60
> [ 619.927192] i915_pci_remove+0x14/0x20 [i915]
> [ 619.927205] pci_device_remove+0x34/0xb0
> [ 619.927219] device_release_driver_internal+0x158/0x210
> [ 619.927234] driver_detach+0x3b/0x80
> [ 619.927245] bus_remove_driver+0x53/0xd0
> [ 619.927256] driver_unregister+0x27/0x50
> [ 619.927267] pci_unregister_driver+0x25/0xa0
> [ 619.927351] i915_exit+0x1a/0xb1a [i915]
> [ 619.927362] SyS_delete_module+0x193/0x1e0
> [ 619.927378] entry_SYSCALL_64_fastpath+0x1c/0xb1
> [ 619.927386] RIP: 0033:0x7f82b46c5d37
> [ 619.927393] RSP: 002b:00007ffdb6f610d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
> [ 619.927408] RAX: ffffffffffffffda RBX: ffffffff81481ff3 RCX: 00007f82b46c5d37
> [ 619.927415] RDX: 0000000000000001 RSI: 0000000000000800 RDI: 000000000224f558
> [ 619.927422] RBP: ffffc90001187f88 R08: 0000000000000000 R09: 00007ffdb6f61100
> [ 619.927428] R10: 000000000224f4e0 R11: 0000000000000246 R12: 0000000000000000
> [ 619.927435] R13: 00007ffdb6f612b0 R14: 0000000000000000 R15: 0000000000000000
> [ 619.927451] ? __this_cpu_preempt_check+0x13/0x20
>
> or
>
> [ 641.646590] WARNING: CPU: 1 PID: 8913 at drivers/gpu/drm/i915/intel_drv.h:1750 intel_runtime_pm_get_noresume+0x8b/0x90 [i915]
> [ 641.646595] RPM wakelock ref not held during HW access
> [ 641.646600] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul ghash_clmulni_intel snd_pcm mei_me mei i915(-) r8169 mii prime_numbers i2c_hid [last unloaded: snd_hda_intel]
> [ 641.646825] CPU: 1 PID: 8913 Comm: drv_module_relo Tainted: G U 4.10.0-CI-Trybot_609+ #1
> [ 641.646836] Hardware name: TOSHIBA SATELLITE P50-C/06F4 , BIOS 1.20 10/08/2015
> [ 641.646843] Call Trace:
> [ 641.646857] dump_stack+0x67/0x92
> [ 641.646869] __warn+0xc6/0xe0
> [ 641.646880] warn_slowpath_fmt+0x4a/0x50
> [ 641.646893] ? __this_cpu_preempt_check+0x13/0x20
> [ 641.646904] ? trace_hardirqs_on_caller+0xe7/0x200
> [ 641.646957] intel_runtime_pm_get_noresume+0x8b/0x90 [i915]
> [ 641.647022] __i915_add_request+0x423/0x540 [i915]
> [ 641.647080] i915_gem_switch_to_kernel_context+0x148/0x2b0 [i915]
> [ 641.647145] i915_gem_suspend+0x2b/0x180 [i915]
> [ 641.647189] i915_driver_unload+0x22/0x200 [i915]
> [ 641.647200] ? __this_cpu_preempt_check+0x13/0x20
> [ 641.647210] ? trace_hardirqs_on_caller+0xe7/0x200
> [ 641.647220] ? trace_hardirqs_on+0xd/0x10
> [ 641.647231] ? _raw_spin_unlock_irqrestore+0x3d/0x60
> [ 641.647276] i915_pci_remove+0x14/0x20 [i915]
> [ 641.647293] pci_device_remove+0x34/0xb0
> [ 641.647307] device_release_driver_internal+0x158/0x210
> [ 641.647321] driver_detach+0x3b/0x80
> [ 641.647330] bus_remove_driver+0x53/0xd0
> [ 641.647338] driver_unregister+0x27/0x50
> [ 641.647348] pci_unregister_driver+0x25/0xa0
> [ 641.647415] i915_exit+0x1a/0xb1a [i915]
> [ 641.647429] SyS_delete_module+0x193/0x1e0
> [ 641.647444] entry_SYSCALL_64_fastpath+0x1c/0xb1
> [ 641.647453] RIP: 0033:0x7fc622bd2d37
> [ 641.647463] RSP: 002b:00007ffff8ffb5c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
> [ 641.647475] RAX: ffffffffffffffda RBX: ffffffff81481ff3 RCX: 00007fc622bd2d37
> [ 641.647480] RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000000000d49118
> [ 641.647485] RBP: ffffc90000997f88 R08: 0000000000000000 R09: 00007ffff8ffb5f0
> [ 641.647491] R10: 0000000000d490a0 R11: 0000000000000246 R12: 0000000000000000
> [ 641.647498] R13: 00007ffff8ffb7a0 R14: 0000000000000000 R15: 0000000000000000
> [ 641.647510] ? __this_cpu_preempt_check+0x13/0x20
>
> v2: Keep holding rpm until the end to cover i915_gem_sanitize() as well.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Fixes at least 1c777c5d1dc ("drm/i915/hsw: Fix GPU hang during resume from
S3-devices state")
Reviewed-by: Imre Deak <imre.deak at intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 51b690dc81df..26bea59d6ca8 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4186,6 +4186,7 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
>  	struct drm_device *dev = &dev_priv->drm;
>  	int ret;
>
> +	intel_runtime_pm_get(dev_priv);
>  	intel_suspend_gt_powersave(dev_priv);
>
>  	mutex_lock(&dev->struct_mutex);
> @@ -4200,13 +4201,13 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
>  	 */
>  	ret = i915_gem_switch_to_kernel_context(dev_priv);
>  	if (ret)
> -		goto err;
> +		goto err_unlock;
>
>  	ret = i915_gem_wait_for_idle(dev_priv,
>  				     I915_WAIT_INTERRUPTIBLE |
>  				     I915_WAIT_LOCKED);
>  	if (ret)
> -		goto err;
> +		goto err_unlock;
>
>  	i915_gem_retire_requests(dev_priv);
>  	GEM_BUG_ON(dev_priv->gt.active_requests);
> @@ -4252,11 +4253,12 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
>  	 * machine in an unusable condition.
>  	 */
>  	i915_gem_sanitize(dev_priv);
> +	goto out_rpm_put;
>
> -	return 0;
> -
> -err:
> +err_unlock:
>  	mutex_unlock(&dev->struct_mutex);
> +out_rpm_put:
> +	intel_runtime_pm_put(dev_priv);
>  	return ret;
>  }
>
> --
> 2.11.0
>
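For readers following the control flow, here is a minimal userspace sketch of
the get/put and goto-unwind pattern the patch introduces. It is not the driver
code: struct mock_dev, the mock_* helpers, and the fail_step knob are all
hypothetical stand-ins, and the point where the lock is dropped on the success
path is an assumption about the part of i915_gem_suspend() that the diff does
not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not struct drm_i915_private. */
struct mock_dev {
	int wakeref_count;	/* models the runtime pm wakeref */
	bool locked;		/* models dev->struct_mutex */
	int hw_warnings;	/* counts "RPM wakelock ref not held" events */
	int fail_step;		/* 0 = none, 1 = context switch, 2 = wait for idle */
};

static void mock_rpm_get(struct mock_dev *d) { d->wakeref_count++; }
static void mock_rpm_put(struct mock_dev *d) { d->wakeref_count--; }

/* Every simulated hardware access checks that a wakeref is held. */
static void mock_hw_access(struct mock_dev *d)
{
	if (d->wakeref_count <= 0)
		d->hw_warnings++;
}

/* One fallible step that touches the hardware; -5 stands in for -EIO. */
static int mock_step(struct mock_dev *d, int step)
{
	mock_hw_access(d);
	return d->fail_step == step ? -5 : 0;
}

static int mock_gem_suspend(struct mock_dev *d)
{
	int ret;

	mock_rpm_get(d);	/* taken first, before any hardware access */
	mock_hw_access(d);	/* stands in for the powersave suspend */

	d->locked = true;
	ret = mock_step(d, 1);	/* stands in for the kernel context switch */
	if (ret)
		goto err_unlock;

	ret = mock_step(d, 2);	/* stands in for the wait for idle */
	if (ret)
		goto err_unlock;

	d->locked = false;	/* assumed: success path unlocks before sanitize */
	mock_hw_access(d);	/* stands in for the sanitize step */
	goto out_rpm_put;

err_unlock:
	d->locked = false;
out_rpm_put:
	mock_rpm_put(d);	/* single exit: wakeref dropped on every path */
	return ret;
}
```

Calling mock_gem_suspend() on a zero-initialised struct mock_dev, with
fail_step left at 0 or set to 1 or 2, should leave wakeref_count at 0, locked
false, and hw_warnings at 0. Removing the mock_rpm_get() call makes every
mock_hw_access() count a warning, which mirrors the two traces quoted above.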