[Bug 110927] New: [CI][SHARDS] igt@gem_exec_schedule@semaphore-resolve - dmesg-fail - WARNING: possible circular locking dependency detected
bugzilla-daemon at freedesktop.org
Mon Jun 17 05:31:59 UTC 2019
https://bugs.freedesktop.org/show_bug.cgi?id=110927
Bug ID: 110927
Summary: [CI][SHARDS] igt@gem_exec_schedule@semaphore-resolve - dmesg-fail - WARNING: possible circular locking dependency detected
Product: DRI
Version: DRI git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/Intel
Assignee: intel-gfx-bugs at lists.freedesktop.org
Reporter: martin.peres at free.fr
QA Contact: intel-gfx-bugs at lists.freedesktop.org
CC: intel-gfx-bugs at lists.freedesktop.org
The test used to fail (https://bugs.freedesktop.org/show_bug.cgi?id=110519);
it now fails with a WARNING in dmesg:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6272/shard-skl9/igt@gem_exec_schedule@semaphore-resolve.html
Starting subtest: semaphore-resolve
(gem_exec_schedule:2891) igt_aux-CRITICAL: Test assertion failure function sig_abort, file ../lib/igt_aux.c:502:
(gem_exec_schedule:2891) igt_aux-CRITICAL: Failed assertion: !"GPU hung"
Subtest semaphore-resolve failed.
<6> [759.522330] i915 0000:00:02.0: GPU HANG: ecode 9:1:0xfffffffe, in gem_exec_schedu [2891], hang on rcs0
<6> [759.522623] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6> [759.522645] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6> [759.522664] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6> [759.522681] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6> [759.522700] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<7> [759.532843] [drm:i915_reset_device [i915]] resetting chip
<5> [759.538823] i915 0000:00:02.0: Resetting chip for hang on rcs0
<4> [759.540201]
<4> [759.540227] ======================================================
<4> [759.540259] WARNING: possible circular locking dependency detected
<4> [759.540297] 5.2.0-rc4-CI-CI_DRM_6272+ #1 Tainted: G U
<4> [759.540327] ------------------------------------------------------
<4> [759.540361] kworker/0:1/2700 is trying to acquire lock:
<4> [759.540392] 000000005112029f (wakeref#2/1){+.+.}, at: __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.540717]
but task is already holding lock:
<4> [759.540748] 000000000de18043 (i915.reset){+.+.}, at: i915_reset+0x57/0x3d0 [i915]
<4> [759.541087]
which lock already depends on the new lock.
<4> [759.541126]
the existing dependency chain (in reverse order) is:
<4> [759.541160]
-> #4 (i915.reset){+.+.}:
<4> [759.541551] i915_request_wait+0x16f/0x940 [i915]
<4> [759.541922] i915_gem_wait_for_idle+0xf5/0x570 [i915]
<4> [759.542275] i915_gem_shrink+0x4b0/0x630 [i915]
<4> [759.542609] i915_gem_shrink_all+0x2c/0x50 [i915]
<4> [759.542917] i915_drop_caches_set+0x1de/0x270 [i915]
<4> [759.542958] simple_attr_write+0xb0/0xd0
<4> [759.542996] full_proxy_write+0x51/0x80
<4> [759.543031] vfs_write+0xbd/0x1b0
<4> [759.543062] ksys_write+0x8f/0xe0
<4> [759.543093] do_syscall_64+0x55/0x1c0
<4> [759.543129] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.543159]
-> #3 (&dev->struct_mutex/1){+.+.}:
<4> [759.543522] i915_gem_shrinker_taints_mutex+0x52/0xe0 [i915]
<4> [759.543889] i915_address_space_init+0x59/0x120 [i915]
<4> [759.544255] i915_ggtt_init_hw+0x50/0x150 [i915]
<4> [759.544526] i915_driver_load+0xebb/0x18b0 [i915]
<4> [759.544801] i915_pci_probe+0x3f/0x1a0 [i915]
<4> [759.544836] pci_device_probe+0x9e/0x120
<4> [759.544870] really_probe+0xea/0x3c0
<4> [759.544901] driver_probe_device+0x10b/0x120
<4> [759.544936] device_driver_attach+0x4a/0x50
<4> [759.544970] __driver_attach+0x97/0x130
<4> [759.545001] bus_for_each_dev+0x74/0xc0
<4> [759.545032] bus_add_driver+0x13f/0x210
<4> [759.545066] driver_register+0x56/0xe0
<4> [759.545097] do_one_initcall+0x58/0x300
<4> [759.545132] do_init_module+0x56/0x1f6
<4> [759.545164] load_module+0x24d1/0x2990
<4> [759.545196] __se_sys_finit_module+0xd3/0xf0
<4> [759.545229] do_syscall_64+0x55/0x1c0
<4> [759.545264] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.545293]
-> #2 (fs_reclaim){+.+.}:
<4> [759.545344] fs_reclaim_acquire.part.24+0x24/0x30
<4> [759.545381] kmem_cache_alloc_trace+0x2a/0x290
<4> [759.545730] i915_gem_object_get_pages_stolen+0xa2/0x130 [i915]
<4> [759.546084] ____i915_gem_object_get_pages+0x1d/0xa0 [i915]
<4> [759.546436] __i915_gem_object_get_pages+0x59/0xb0 [i915]
<4> [759.546788] _i915_gem_object_create_stolen+0xd4/0x100 [i915]
<4> [759.547143] i915_gem_object_create_stolen_for_preallocated+0xf0/0x550 [i915]
<4> [759.547548] intel_alloc_initial_plane_obj.isra.119+0xc6/0x1d0 [i915]
<4> [759.547954] intel_modeset_init+0x8b8/0x1a20 [i915]
<4> [759.548223] i915_driver_load+0xdb1/0x18b0 [i915]
<4> [759.548497] i915_pci_probe+0x3f/0x1a0 [i915]
<4> [759.548533] pci_device_probe+0x9e/0x120
<4> [759.548567] really_probe+0xea/0x3c0
<4> [759.548599] driver_probe_device+0x10b/0x120
<4> [759.548639] device_driver_attach+0x4a/0x50
<4> [759.548673] __driver_attach+0x97/0x130
<4> [759.548706] bus_for_each_dev+0x74/0xc0
<4> [759.548737] bus_add_driver+0x13f/0x210
<4> [759.548771] driver_register+0x56/0xe0
<4> [759.548803] do_one_initcall+0x58/0x300
<4> [759.548836] do_init_module+0x56/0x1f6
<4> [759.548869] load_module+0x24d1/0x2990
<4> [759.548902] __se_sys_finit_module+0xd3/0xf0
<4> [759.548936] do_syscall_64+0x55/0x1c0
<4> [759.548970] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.549001]
-> #1 (&obj->mm.lock){+.+.}:
<4> [759.549046] __mutex_lock+0x8c/0x960
<4> [759.549391] i915_gem_object_pin_map+0x2d/0x2a0 [i915]
<4> [759.549701] __engine_unpark+0x42/0x80 [i915]
<4> [759.549992] __intel_wakeref_get_first+0x40/0xa0 [i915]
<4> [759.550369] i915_request_create+0x101/0x240 [i915]
<4> [759.550709] i915_gem_do_execbuffer+0xb07/0x20f0 [i915]
<4> [759.551050] i915_gem_execbuffer2_ioctl+0x11b/0x430 [i915]
<4> [759.551090] drm_ioctl_kernel+0x83/0xf0
<4> [759.551123] drm_ioctl+0x2f3/0x3b0
<4> [759.551153] do_vfs_ioctl+0xa0/0x6e0
<4> [759.551184] ksys_ioctl+0x35/0x60
<4> [759.551214] __x64_sys_ioctl+0x11/0x20
<4> [759.551246] do_syscall_64+0x55/0x1c0
<4> [759.551280] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [759.551309]
-> #0 (wakeref#2/1){+.+.}:
<4> [759.551363] lock_acquire+0xa6/0x1c0
<4> [759.551394] __mutex_lock+0x8c/0x960
<4> [759.551679] __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.551999] reset_prepare_engine+0x9/0x30 [i915]
<4> [759.552317] reset_prepare+0x29/0x40 [i915]
<4> [759.552636] i915_reset+0x9f/0x3d0 [i915]
<4> [759.552953] i915_reset_device+0xf2/0x170 [i915]
<4> [759.553272] i915_handle_error+0x231/0x370 [i915]
<4> [759.553580] i915_hangcheck_elapsed+0x41c/0x530 [i915]
<4> [759.553620] process_one_work+0x245/0x610
<4> [759.553651] worker_thread+0x37/0x380
<4> [759.553684] kthread+0x119/0x130
<4> [759.553716] ret_from_fork+0x3a/0x50
<4> [759.553743]
other info that might help us debug this:
<4> [759.553784] Chain exists of:
wakeref#2/1 --> &dev->struct_mutex/1 --> i915.reset
<4> [759.553852]  Possible unsafe locking scenario:
<4> [759.553884]        CPU0                    CPU1
<4> [759.553909]        ----                    ----
<4> [759.553934]   lock(i915.reset);
<4> [759.553959]                                lock(&dev->struct_mutex/1);
<4> [759.553996]                                lock(i915.reset);
<4> [759.554028]   lock(wakeref#2/1);
<4> [759.554059]
*** DEADLOCK ***
<4> [759.554095] 4 locks held by kworker/0:1/2700:
<4> [759.554120] #0: 00000000c7971d7e ((wq_completion)events_long){+.+.}, at: process_one_work+0x1bf/0x610
<4> [759.554182] #1: 000000003ad9146c ((work_completion)(&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: process_one_work+0x1bf/0x610
<4> [759.554252] #2: 000000005f0d6d67 (&dev_priv->gpu_error.wedge_mutex){+.+.}, at: i915_reset_device+0xe4/0x170 [i915]
<4> [759.554599] #3: 000000000de18043 (i915.reset){+.+.}, at: i915_reset+0x57/0x3d0 [i915]
<4> [759.554941]
stack backtrace:
<4> [759.554979] CPU: 0 PID: 2700 Comm: kworker/0:1 Tainted: G U 5.2.0-rc4-CI-CI_DRM_6272+ #1
<4> [759.555025] Hardware name: Google Caroline/Caroline, BIOS MrChromebox 08/27/2018
<4> [759.555345] Workqueue: events_long i915_hangcheck_elapsed [i915]
<4> [759.555381] Call Trace:
<4> [759.555413] dump_stack+0x67/0x9b
<4> [759.555453] print_circular_bug+0x1c8/0x2b0
<4> [759.555493] __lock_acquire+0x1ce9/0x24c0
<4> [759.555537] ? wake_up_klogd+0x4a/0x60
<4> [759.555570] ? __bfs+0xe8/0x220
<4> [759.555608] ? lock_acquire+0xa6/0x1c0
<4> [759.555643] lock_acquire+0xa6/0x1c0
<4> [759.555931] ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.555973] __mutex_lock+0x8c/0x960
<4> [759.556257] ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.556564] ? intel_engine_stop_cs+0x1f/0xb0 [i915]
<4> [759.556853] ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.556899] ? lock_acquire+0xa6/0x1c0
<4> [759.557207] ? intel_engine_cs_mock_selftests+0x20/0x20 [i915]
<4> [759.557497] ? __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.557786] __intel_wakeref_get_first+0x24/0xa0 [i915]
<4> [759.557828] ? _raw_spin_unlock_irqrestore+0x4c/0x60
<4> [759.558149] reset_prepare_engine+0x9/0x30 [i915]
<4> [759.558470] reset_prepare+0x29/0x40 [i915]
<4> [759.558789] i915_reset+0x9f/0x3d0 [i915]
<4> [759.559113] i915_reset_device+0xf2/0x170 [i915]
<4> [759.559439] ? i915_gem_set_wedged+0x60/0x60 [i915]
<4> [759.559483] ? queue_work_node+0x70/0x70
<4> [759.559763] i915_handle_error+0x231/0x370 [i915]
<4> [759.559812] ? string+0x40/0x50
<4> [759.560133] i915_hangcheck_elapsed+0x41c/0x530 [i915]
<4> [759.560178] ? __drm_printfn_info+0x20/0x20
<4> [759.560222] ? __lock_acquire+0x530/0x24c0
<4> [759.560268] ? debug_object_deactivate+0x137/0x160
<4> [759.560315] ? lock_acquire+0xa6/0x1c0
<4> [759.560349] ? process_one_work+0x1bf/0x610
<4> [759.560387] process_one_work+0x245/0x610
<4> [759.560430] worker_thread+0x37/0x380
<4> [759.560465] ? process_one_work+0x610/0x610
<4> [759.560499] kthread+0x119/0x130
<4> [759.560532] ? kthread_park+0x80/0x80
<4> [759.560568] ret_from_fork+0x3a/0x50
<7> [759.561198] [drm:i915_reset_request [i915]] client gem_exec_schedu[2891]: gained 1 ban score, now 1
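
For readers unfamiliar with lockdep reports, the dependency chain above can be read as a directed graph of "held before acquiring" edges. The sketch below is only an illustration and is not part of the original report: the edge list is transcribed by hand from entries #1 to #4 of the splat, the new edge comes from entry #0, and the helper names (find_path, closes_cycle) are made up for this example. It shows why taking wakeref#2/1 while holding i915.reset closes a cycle.

```python
# Illustrative only: edges transcribed by hand from the lockdep chain above.
# Each pair is (lock already held, lock then acquired).
EXISTING_EDGES = [
    ("wakeref#2/1", "&obj->mm.lock"),         # entry #1
    ("&obj->mm.lock", "fs_reclaim"),          # entry #2
    ("fs_reclaim", "&dev->struct_mutex/1"),   # entry #3
    ("&dev->struct_mutex/1", "i915.reset"),   # entry #4
]

# Entry #0: the reset worker holds i915.reset and wants wakeref#2/1.
NEW_EDGE = ("i915.reset", "wakeref#2/1")


def find_path(edges, start, goal):
    """Return a list of locks leading from start to goal, or None."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)

    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def closes_cycle(edges, new_edge):
    """Return the cycle created by adding new_edge, or None if there is none."""
    held, acquired = new_edge
    # A cycle exists if the lock being acquired already leads back to the held one.
    path = find_path(edges, acquired, held)
    return None if path is None else [held] + path


if __name__ == "__main__":
    cycle = closes_cycle(EXISTING_EDGES, NEW_EDGE)
    print(" --> ".join(cycle) if cycle else "no cycle")
```

The printed cycle starts and ends at i915.reset and lists every lock in the chain. The "Chain exists of" line in the report shows an abbreviated version of the same chain.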