[Intel-gfx] [BUG, bisect] Re: drm/i915: WARN_ON(dev_priv->mm.busy)
Jeremiah Mahler
jmmahler at gmail.com
Sun Jun 7 22:56:41 PDT 2015
all,
On Sat, Jun 06, 2015 at 08:09:34PM -0700, Jeremiah Mahler wrote:
> all,
>
> On all my machines with Intel graphics I get the following warning
> in the logs when the machine is suspended. Apparently some part of
> the graphics system is busy when it should be idle. This is present
> on the latest linux-next 20150604.
>
> ...
> [ 33.141747] Suspending console(s) (use no_console_suspend to debug)
> [ 33.142146] wlan0: deauthenticating from 00:1a:70:5a:6e:0b by local
> choice (Reason: 3=DEAUTH_LEAVING)
> [ 33.147395] queueing ieee80211 work while going to suspend
> [ 33.151597] cfg80211: Calling CRDA to update world regulatory domain
> [ 33.190430] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> [ 33.190523] sd 0:0:0:0: [sda] Stopping disk
> [ 33.275743] ------------[ cut here ]------------
> [ 33.275764] WARNING: CPU: 0 PID: 1617 at
> drivers/gpu/drm/i915/i915_gem.c:4808 i915_gem_suspend+0xe4/0xf0 [i915]()
> [ 33.275766] WARN_ON(dev_priv->mm.busy)
> [ 33.275811] Modules linked in: binfmt_misc snd_hda_codec_hdmi
> hid_generic isl29018(C) industrialio regmap_i2c cyapatp crc_itu_t usbhid
> hid arc4 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi
> coretemp ath9k tpm_infineon kvm_intel kvm ath9k_common ath9k_hw
> crct10dif_pclmul crc32_pclmul crc32c_intel chromeos_laptop ath mac80211
> ghash_clmulni_intel cryptd i915 cfg80211 pcspkr serio_raw sg ath3k btusb
> btrtl lpc_ich snd_hda_codec_realtek shpchp i2c_i801 mfd_core
> snd_hda_codec_generic btbcm btintel bluetooth snd_hda_intel battery
> snd_hda_codec ac i2c_algo_bit drm_kms_helper tpm_tis snd_hwdep tpm
> snd_hda_core drm snd_pcm video rfkill processor button snd_timer snd
> soundcore i2c_designware_pci i2c_designware_core evdev uvcvideo
> videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev
> [ 33.275825] media i2c_core fuse autofs4 ext4 crc16 mbcache jbd2
> sd_mod fan xhci_pci sdhci_acpi sdhci xhci_hcd mmc_core thermal
> thermal_sys usbcore ahci libahci usb_common libata scsi_mod
> [ 33.275828] CPU: 0 PID: 1617 Comm: kworker/u4:4 Tainted: G C
> 4.1.0-rc6-next-20150604+ #207
> [ 33.275829] Hardware name: Acer Peppy, BIOS 04/30/2014
> [ 33.275834] Workqueue: events_unbound async_run_entry_fn
> [ 33.275838] 0000000000000000 ffffffffa05b7908 ffffffff8152ca4d
> ffff880035effc58
> [ 33.275840] ffffffff8106bce1 ffff880073587f20 0000000000000000
> ffff880073580000
> [ 33.275842] ffff88003534f860 ffff880073580000 ffffffff8106bd5a
> ffffffffa05c74c1
> [ 33.275843] Call Trace:
> [ 33.275849] [<ffffffff8152ca4d>] ? dump_stack+0x40/0x50
> [ 33.275853] [<ffffffff8106bce1>] ? warn_slowpath_common+0x81/0xb0
> [ 33.275855] [<ffffffff8106bd5a>] ? warn_slowpath_fmt+0x4a/0x50
> [ 33.275865] [<ffffffffa0520264>] ? i915_gem_suspend+0xe4/0xf0 [i915]
> [ 33.275872] [<ffffffffa04f48c1>] ? i915_drm_suspend+0x61/0x1b0
> [i915]
> [ 33.275876] [<ffffffff81302381>] ? pci_pm_suspend+0x71/0x140
> [ 33.275878] [<ffffffff81302310>] ? pci_pm_freeze+0xd0/0xd0
> [ 33.275881] [<ffffffff813e1699>] ? dpm_run_callback+0x39/0xd0
> [ 33.275883] [<ffffffff813e20d4>] ? __device_suspend+0xe4/0x300
> [ 33.275884] [<ffffffff813e230e>] ? async_suspend+0x1e/0x90
> [ 33.275887] [<ffffffff8108a893>] ? async_run_entry_fn+0x43/0x150
> [ 33.275890] [<ffffffff810828f8>] ? process_one_work+0x148/0x3b0
> [ 33.275892] [<ffffffff81082e8a>] ? worker_thread+0x4a/0x440
> [ 33.275895] [<ffffffff81082e40>] ? rescuer_thread+0x2e0/0x2e0
> [ 33.275898] [<ffffffff81088121>] ? kthread+0xc1/0xe0
> [ 33.275901] [<ffffffff81088060>] ?
> kthread_create_on_node+0x190/0x190
> [ 33.275904] [<ffffffff8153365f>] ? ret_from_fork+0x3f/0x70
> [ 33.275907] [<ffffffff81088060>] ?
> kthread_create_on_node+0x190/0x190
> [ 33.275908] ---[ end trace e1c3eb5e163b3520 ]---
> [ 33.560558] PM: suspend of devices complete after 423.034 msecs
> [ 33.577985] PM: late suspend of devices complete after 17.589 msecs
> [ 33.579036] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI
> [ 33.594059] PM: noirq suspend of devices complete after 16.226 msecs
> [ 33.594498] ACPI: Preparing to enter system sleep state S3
> [ 33.595066] ACPI : EC: EC stopped
> ...
>
> --
> - Jeremiah Mahler
I bisected the kernel and found that the following patch introduced the
bug.
From b47161858ba13c9c7e03333132230d66e008dd55 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris at chris-wilson.co.uk>
Date: Mon, 27 Apr 2015 13:41:17 +0100
Subject: [PATCH] drm/i915: Implement inter-engine read-read optimisations
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently, we only track the last request globally across all engines.
This prevents us from issuing concurrent read requests on e.g. the RCS
and BCS engines (or more likely the render and media engines). Without
semaphores, we incur costly stalls as we synchronise between rings -
greatly impacting the current performance of Broadwell versus Haswell in
certain workloads (like video decode). With the introduction of
reference counted requests, it is much easier to track the last request
per ring, as well as the last global write request so that we can
optimise inter-engine read read requests (as well as better optimise
certain CPU waits).
v2: Fix inverted readonly condition for nonblocking waits.
v3: Handle non-continguous engine array after waits
v4: Rebase, tidy, rewrite ring list debugging
v5: Use obj->active as a bitfield, it looks cool
v6: Micro-optimise, mostly involving moving code around
v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf)
v8: Rebase
v9: Refactor i915_gem_object_sync() to allow the compiler to better
optimise it.
Benchmark: igt/gem_read_read_speed
hsw:gt3e (with semaphores):
Before: Time to read-read 1024k: 275.794µs
After: Time to read-read 1024k: 123.260µs
hsw:gt3e (w/o semaphores):
Before: Time to read-read 1024k: 230.433µs
After: Time to read-read 1024k: 124.593µs
bdw-u (w/o semaphores): Before After
Time to read-read 1x1: 26.274µs 10.350µs
Time to read-read 128x128: 40.097µs 21.366µs
Time to read-read 256x256: 77.087µs 42.608µs
Time to read-read 512x512: 281.999µs 181.155µs
Time to read-read 1024x1024: 1196.141µs 1118.223µs
Time to read-read 2048x2048: 5639.072µs 5225.837µs
Time to read-read 4096x4096: 22401.662µs 21137.067µs
Time to read-read 8192x8192: 89617.735µs 85637.681µs
Testcase: igt/gem_concurrent_blit (read-read and friends)
Cc: Lionel Landwerlin <lionel.g.landwerlin at linux.intel.com>
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com> [v8]
[danvet: s/\<rq\>/req/g]
Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
---
drivers/gpu/drm/i915/i915_debugfs.c | 16 +-
drivers/gpu/drm/i915/i915_drv.h | 19 +-
drivers/gpu/drm/i915/i915_gem.c | 540 ++++++++++++++++++++------------
drivers/gpu/drm/i915/i915_gem_context.c | 2 -
drivers/gpu/drm/i915/i915_gem_debug.c | 92 ++----
drivers/gpu/drm/i915/i915_gpu_error.c | 19 +-
drivers/gpu/drm/i915/intel_display.c | 6 +-
drivers/gpu/drm/i915/intel_lrc.c | 19 +-
drivers/gpu/drm/i915/intel_overlay.c | 2 -
drivers/gpu/drm/i915/intel_ringbuffer.c | 26 +-
10 files changed, 416 insertions(+), 325 deletions(-)
...
--
- Jeremiah Mahler
More information about the Intel-gfx
mailing list