[Intel-gfx] [PATCH] drm/i915: Only wait upon the execution timeline when unlocked

Chris Wilson chris at chris-wilson.co.uk
Fri Nov 11 13:52:21 UTC 2016


On Fri, Nov 11, 2016 at 03:23:49PM +0200, Joonas Lahtinen wrote:
> On Thu, 2016-11-10 at 17:36 +0000, Chris Wilson wrote:
> > In order to walk the list of all timelines, we currently require the
> > struct_mutex. We are sometimes called prior to the struct_mutex being
> > taken by the caller (i.e. !I915_WAIT_LOCKED), in which case we can only
> > trust the global execution timelines (as these are owned by the device).
> > This means in the unlocked phase we can only wait upon the currently
> > executing requests and not all queued.
> > 
> > [  175.743243] general protection fault: 0000 [#1] SMP
> > [  175.743263] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iwlwifi aesni_intel aes_x86_64 lrw snd_soc_rt5640 gf128mul snd_soc_rl6231 snd_soc_core glue_helper snd_compress snd_pcm_dmaengine snd_hda_codec_hdmi ablk_helper snd_hda_codec_realtek cryptd snd_hda_codec_generic serio_raw cfg80211 snd_hda_intel snd_hda_codec ir_lirc_codec snd_hda_core lirc_dev snd_hwdep snd_pcm lpc_ich mei_me mei snd_seq_midi shpchp snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer rc_rc6_mce acpi_als nuvoton_cir kfifo_buf rc_core snd industrialio snd_soc_sst_acpi soundcore snd_soc_sst_match i2c_designware_platform 8250_dw i2c_designware_core dw_dmac spi_pxa2xx_platform mac_hid acpi_pad parport_pc ppdev lp parport
> > [  175.743509]  autofs4 i915 e1000e psmouse ptp pps_core xhci_pci ehci_pci ahci xhci_hcd ehci_hcd libahci video sdhci_acpi sdhci i2c_hid hid
> > [  175.743560] CPU: 2 PID: 2386 Comm: wtdg_monitor.sh Tainted: G     U          4.9.0-rc4-nightly+ #2
> > [  175.743581] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0358.2016.0606.1423 06/06/2016
> > [  175.743603] task: ffff88024509ba80 task.stack: ffffc9007bd18000
> > [  175.743618] RIP: 0010:[<ffffffffa01af29b>]  [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915]
> > [  175.743660] RSP: 0000:ffffc9007bd1b9b8  EFLAGS: 00010297
> > [  175.743674] RAX: ffff88024489d248 RBX: 0000000000000000 RCX: 0000000000000000
> > [  175.743691] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880244898000
> > [  175.743708] RBP: ffffc9007bd1b9f0 R08: 0000000000000000 R09: 0000000000000001
> > [  175.743724] R10: 00000028eaf42792 R11: 0000000000000001 R12: dead000000000100
> > [  175.743741] R13: dead000000000148 R14: ffffc9007bd1ba5f R15: 0000000000000005
> > [  175.743758] FS:  00007f2638330700(0000) GS:ffff880256d00000(0000) knlGS:0000000000000000
> > [  175.743777] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  175.743791] CR2: 00007f885c8cea40 CR3: 00000002416b5000 CR4: 00000000003406e0
> > [  175.743808] Stack:
> > [  175.743816]  ffff88024489d248 000000004509ba80 ffff880244898000 ffff88024509ba80
> > [  175.743840]  00000000ffff8b69 ffffc9007bd1ba5f ffffc9007bd1ba5e ffffc9007bd1ba28
> > [  175.743863]  ffffffffa01b661d 00000000ffffffff 0000000000000000 ffff880244898000
> > [  175.743886] Call Trace:
> > [  175.743906]  [<ffffffffa01b661d>] i915_gem_shrinker_lock_uninterruptible.constprop.5+0x5d/0xc0 [i915]
> > [  175.743937]  [<ffffffffa01b6cd0>] i915_gem_shrinker_oom+0x30/0x1b0 [i915]
> > [  175.743955]  [<ffffffff8109ca79>] notifier_call_chain+0x49/0x70
> > [  175.743971]  [<ffffffff8109cd9d>] __blocking_notifier_call_chain+0x4d/0x70
> > [  175.743988]  [<ffffffff8109cdd6>] blocking_notifier_call_chain+0x16/0x20
> > [  175.744005]  [<ffffffff811885dc>] out_of_memory+0x22c/0x480
> > [  175.744020]  [<ffffffff81205542>] __alloc_pages_slowpath+0x851/0x8ec
> > [  175.744037]  [<ffffffff8118ca51>] __alloc_pages_nodemask+0x2c1/0x310
> > [  175.744054]  [<ffffffff811d8ea8>] alloc_pages_current+0x88/0x120
> > [  175.744070]  [<ffffffff811833a4>] __page_cache_alloc+0xb4/0xc0
> > [  175.744086]  [<ffffffff811865ca>] filemap_fault+0x29a/0x500
> > [  175.744101]  [<ffffffff81299aa6>] ext4_filemap_fault+0x36/0x50
> > [  175.744117]  [<ffffffff811b3d4a>] __do_fault+0x6a/0xe0
> > [  175.744131]  [<ffffffff811b97ee>] handle_mm_fault+0xd0e/0x1330
> > [  175.744147]  [<ffffffff8106738c>] __do_page_fault+0x23c/0x4d0
> > [  175.744162]  [<ffffffff81067650>] do_page_fault+0x30/0x80
> > [  175.744177]  [<ffffffff817ffbe8>] page_fault+0x28/0x30
> > [  175.744191] Code: 41 57 41 56 41 55 41 54 53 48 83 ec 10 4c 8b a7 48 52 00 00 89 75 d4 48 89 45 c8 49 39 c4 74 78 4d 8d 6c 24 48 41 bf 05 00 00 00 <49> 8b 5d 00 48 85 db 74 50 8b 83 20 01 00 00 85 c0 74 15 48 8b
> > [  175.744320] RIP  [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915]
> > [  175.744351]  RSP <ffffc9007bd1b9b8>
> > 
> > Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> 
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>

I was expecting at least a little fight over the lack of presentation of
possible alternatives in the commit log!

It is obviously not an ideal unlocked wait_for_idle, but for its couple
of users it seems a reasonable compromise compared to going fully
struct_mutex-less.
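
For anyone reading along, the compromise described in the commit message
can be sketched as a small userspace model. This is only an illustration
under stated assumptions, not the i915 code: struct device, struct
timeline, timeline_wait() and the WAIT_LOCKED flag are hypothetical
stand-ins (the flag is modelled on I915_WAIT_LOCKED), and "waiting" is
reduced to retiring counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag, modelled on I915_WAIT_LOCKED from the commit message. */
#define WAIT_LOCKED 0x1u
#define NUM_ENGINES 2

struct timeline {
	unsigned int pending[NUM_ENGINES]; /* outstanding requests per engine */
	struct timeline *next;             /* list link, only stable under the lock */
};

struct device {
	struct timeline global; /* execution timeline, owned by the device */
	struct timeline *all;   /* every timeline; may only be walked when locked */
};

/* Stand-in for a real wait: retire everything and report how much there was. */
static unsigned int timeline_wait(struct timeline *tl)
{
	unsigned int waited = 0;

	for (size_t i = 0; i < NUM_ENGINES; i++) {
		waited += tl->pending[i];
		tl->pending[i] = 0;
	}
	return waited;
}

/*
 * With the lock held the list of timelines cannot change underneath us,
 * so every queued request can be waited upon. Without it, only the
 * device-owned global timeline is safe to touch, so only the requests
 * on that timeline are waited upon.
 */
static unsigned int wait_for_idle(struct device *dev, unsigned int flags)
{
	unsigned int waited = 0;

	assert(dev != NULL);

	if (flags & WAIT_LOCKED) {
		for (struct timeline *tl = dev->all; tl; tl = tl->next)
			waited += timeline_wait(tl);
	} else {
		waited += timeline_wait(&dev->global);
	}
	return waited;
}
```

In this model an unlocked caller never dereferences the list links at
all, which is the point of the restriction: the crash in the trace above
comes from walking the list without the lock.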
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre