[Intel-gfx] [BUG, bisect] Re: drm/i915: WARN_ON(dev_priv->mm.busy)

Jeremiah Mahler jmmahler at gmail.com
Sun Jun 7 22:56:41 PDT 2015


all,

On Sat, Jun 06, 2015 at 08:09:34PM -0700, Jeremiah Mahler wrote:
> all,
> 
> On all my machines with Intel graphics I get the following warning
> in the logs when the machine is suspended.  Apparently some part of
> the graphics system is busy when it should be idle. This is present
> on the latest linux-next 20150604.
> 
>   ...
>   [   33.141747] Suspending console(s) (use no_console_suspend to debug)
>   [   33.142146] wlan0: deauthenticating from 00:1a:70:5a:6e:0b by local
>   choice (Reason: 3=DEAUTH_LEAVING)
>   [   33.147395] queueing ieee80211 work while going to suspend
>   [   33.151597] cfg80211: Calling CRDA to update world regulatory domain
>   [   33.190430] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>   [   33.190523] sd 0:0:0:0: [sda] Stopping disk
>   [   33.275743] ------------[ cut here ]------------
>   [   33.275764] WARNING: CPU: 0 PID: 1617 at
>   drivers/gpu/drm/i915/i915_gem.c:4808 i915_gem_suspend+0xe4/0xf0 [i915]()
>   [   33.275766] WARN_ON(dev_priv->mm.busy)
>   [   33.275811] Modules linked in: binfmt_misc snd_hda_codec_hdmi
>   hid_generic isl29018(C) industrialio regmap_i2c cyapatp crc_itu_t usbhid
>   hid arc4 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi
>   coretemp ath9k tpm_infineon kvm_intel kvm ath9k_common ath9k_hw
>   crct10dif_pclmul crc32_pclmul crc32c_intel chromeos_laptop ath mac80211
>   ghash_clmulni_intel cryptd i915 cfg80211 pcspkr serio_raw sg ath3k btusb
>   btrtl lpc_ich snd_hda_codec_realtek shpchp i2c_i801 mfd_core
>   snd_hda_codec_generic btbcm btintel bluetooth snd_hda_intel battery
>   snd_hda_codec ac i2c_algo_bit drm_kms_helper tpm_tis snd_hwdep tpm
>   snd_hda_core drm snd_pcm video rfkill processor button snd_timer snd
>   soundcore i2c_designware_pci i2c_designware_core evdev uvcvideo
>   videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev
>   [   33.275825]  media i2c_core fuse autofs4 ext4 crc16 mbcache jbd2
>   sd_mod fan xhci_pci sdhci_acpi sdhci xhci_hcd mmc_core thermal
>   thermal_sys usbcore ahci libahci usb_common libata scsi_mod
>   [   33.275828] CPU: 0 PID: 1617 Comm: kworker/u4:4 Tainted: G         C
>   4.1.0-rc6-next-20150604+ #207
>   [   33.275829] Hardware name: Acer Peppy, BIOS          04/30/2014
>   [   33.275834] Workqueue: events_unbound async_run_entry_fn
>   [   33.275838]  0000000000000000 ffffffffa05b7908 ffffffff8152ca4d
>   ffff880035effc58
>   [   33.275840]  ffffffff8106bce1 ffff880073587f20 0000000000000000
>   ffff880073580000
>   [   33.275842]  ffff88003534f860 ffff880073580000 ffffffff8106bd5a
>   ffffffffa05c74c1
>   [   33.275843] Call Trace:
>   [   33.275849]  [<ffffffff8152ca4d>] ? dump_stack+0x40/0x50
>   [   33.275853]  [<ffffffff8106bce1>] ? warn_slowpath_common+0x81/0xb0
>   [   33.275855]  [<ffffffff8106bd5a>] ? warn_slowpath_fmt+0x4a/0x50
>   [   33.275865]  [<ffffffffa0520264>] ? i915_gem_suspend+0xe4/0xf0 [i915]
>   [   33.275872]  [<ffffffffa04f48c1>] ? i915_drm_suspend+0x61/0x1b0
>   [i915]
>   [   33.275876]  [<ffffffff81302381>] ? pci_pm_suspend+0x71/0x140
>   [   33.275878]  [<ffffffff81302310>] ? pci_pm_freeze+0xd0/0xd0
>   [   33.275881]  [<ffffffff813e1699>] ? dpm_run_callback+0x39/0xd0
>   [   33.275883]  [<ffffffff813e20d4>] ? __device_suspend+0xe4/0x300
>   [   33.275884]  [<ffffffff813e230e>] ? async_suspend+0x1e/0x90
>   [   33.275887]  [<ffffffff8108a893>] ? async_run_entry_fn+0x43/0x150
>   [   33.275890]  [<ffffffff810828f8>] ? process_one_work+0x148/0x3b0
>   [   33.275892]  [<ffffffff81082e8a>] ? worker_thread+0x4a/0x440
>   [   33.275895]  [<ffffffff81082e40>] ? rescuer_thread+0x2e0/0x2e0
>   [   33.275898]  [<ffffffff81088121>] ? kthread+0xc1/0xe0
>   [   33.275901]  [<ffffffff81088060>] ?
>   kthread_create_on_node+0x190/0x190
>   [   33.275904]  [<ffffffff8153365f>] ? ret_from_fork+0x3f/0x70
>   [   33.275907]  [<ffffffff81088060>] ?
>   kthread_create_on_node+0x190/0x190
>   [   33.275908] ---[ end trace e1c3eb5e163b3520 ]---
>   [   33.560558] PM: suspend of devices complete after 423.034 msecs
>   [   33.577985] PM: late suspend of devices complete after 17.589 msecs
>   [   33.579036] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI
>   [   33.594059] PM: noirq suspend of devices complete after 16.226 msecs
>   [   33.594498] ACPI: Preparing to enter system sleep state S3
>   [   33.595066] ACPI : EC: EC stopped
>   ...
> 
> -- 
> - Jeremiah Mahler

I bisected the kernel and found that the following patch introduced the
bug.

  From b47161858ba13c9c7e03333132230d66e008dd55 Mon Sep 17 00:00:00 2001
  From: Chris Wilson <chris at chris-wilson.co.uk>
  Date: Mon, 27 Apr 2015 13:41:17 +0100
  Subject: [PATCH] drm/i915: Implement inter-engine read-read optimisations
  MIME-Version: 1.0
  Content-Type: text/plain; charset=UTF-8
  Content-Transfer-Encoding: 8bit
  
  Currently, we only track the last request globally across all engines.
  This prevents us from issuing concurrent read requests on e.g. the RCS
  and BCS engines (or more likely the render and media engines). Without
  semaphores, we incur costly stalls as we synchronise between rings -
  greatly impacting the current performance of Broadwell versus Haswell in
  certain workloads (like video decode). With the introduction of
  reference counted requests, it is much easier to track the last request
  per ring, as well as the last global write request so that we can
  optimise inter-engine read read requests (as well as better optimise
  certain CPU waits).
  
  v2: Fix inverted readonly condition for nonblocking waits.
  v3: Handle non-continguous engine array after waits
  v4: Rebase, tidy, rewrite ring list debugging
  v5: Use obj->active as a bitfield, it looks cool
  v6: Micro-optimise, mostly involving moving code around
  v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf)
  v8: Rebase
  v9: Refactor i915_gem_object_sync() to allow the compiler to better
  optimise it.
  
  Benchmark: igt/gem_read_read_speed
  hsw:gt3e (with semaphores):
  Before: Time to read-read 1024k:		275.794µs
  After:  Time to read-read 1024k:		123.260µs
  
  hsw:gt3e (w/o semaphores):
  Before: Time to read-read 1024k:		230.433µs
  After:  Time to read-read 1024k:		124.593µs
  
  bdw-u (w/o semaphores):             Before          After
  Time to read-read 1x1:            26.274µs       10.350µs
  Time to read-read 128x128:        40.097µs       21.366µs
  Time to read-read 256x256:        77.087µs       42.608µs
  Time to read-read 512x512:       281.999µs      181.155µs
  Time to read-read 1024x1024:    1196.141µs     1118.223µs
  Time to read-read 2048x2048:    5639.072µs     5225.837µs
  Time to read-read 4096x4096:   22401.662µs    21137.067µs
  Time to read-read 8192x8192:   89617.735µs    85637.681µs
  
  Testcase: igt/gem_concurrent_blit (read-read and friends)
  Cc: Lionel Landwerlin <lionel.g.landwerlin at linux.intel.com>
  Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
  Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
  Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com> [v8]
  [danvet: s/\<rq\>/req/g]
  Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
  ---
   drivers/gpu/drm/i915/i915_debugfs.c     |  16 +-
   drivers/gpu/drm/i915/i915_drv.h         |  19 +-
   drivers/gpu/drm/i915/i915_gem.c         | 540 ++++++++++++++++++++------------
   drivers/gpu/drm/i915/i915_gem_context.c |   2 -
   drivers/gpu/drm/i915/i915_gem_debug.c   |  92 ++----
   drivers/gpu/drm/i915/i915_gpu_error.c   |  19 +-
   drivers/gpu/drm/i915/intel_display.c    |   6 +-
   drivers/gpu/drm/i915/intel_lrc.c        |  19 +-
   drivers/gpu/drm/i915/intel_overlay.c    |   2 -
   drivers/gpu/drm/i915/intel_ringbuffer.c |  26 +-
   10 files changed, 416 insertions(+), 325 deletions(-)
  ...

-- 
- Jeremiah Mahler


More information about the Intel-gfx mailing list