[Bug 102850] [BAT][KBL] igt at gem_exec_suspend@basic-s3 - Incomplete -

bugzilla-daemon at freedesktop.org
Tue Sep 19 08:30:18 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=102850

--- Comment #2 from Chris Wilson <chris at chris-wilson.co.uk> ---
Sometimes we see

[   38.681288] ======================================================
[   38.681288] WARNING: possible circular locking dependency detected
[   38.681289] 4.14.0-rc1-CI-CI_DRM_3099+ #1 Not tainted
[   38.681290] ------------------------------------------------------
[   38.681290] rtcwake/1414 is trying to acquire lock:
[   38.681291]  ((complete)&st->done){+.+.}, at: [<ffffffff8190987d>] wait_for_completion+0x1d/0x20
[   38.681295]
               but task is already holding lock:
[   38.681295]  (sparse_irq_lock){+.+.}, at: [<ffffffff810f2187>] irq_lock_sparse+0x17/0x20
[   38.681298]
               which lock already depends on the new lock.

[   38.681298]
               the existing dependency chain (in reverse order) is:
[   38.681299]
               -> #1 (sparse_irq_lock){+.+.}:
[   38.681301]        __mutex_lock+0x86/0x9b0
[   38.681302]        mutex_lock_nested+0x1b/0x20
[   38.681303]        irq_lock_sparse+0x17/0x20
[   38.681304]        irq_affinity_online_cpu+0x18/0xd0
[   38.681305]        cpuhp_invoke_callback+0xa3/0x840
[   38.681306]
               -> #0 ((complete)&st->done){+.+.}:
[   38.681308]        check_prev_add+0x430/0x840
[   38.681309]        __lock_acquire+0x1420/0x15e0
[   38.681310]        lock_acquire+0xb0/0x200
[   38.681311]        wait_for_common+0x58/0x210
[   38.681311]        wait_for_completion+0x1d/0x20
[   38.681312]        takedown_cpu+0x89/0xf0
[   38.681313]        cpuhp_invoke_callback+0xa3/0x840
[   38.681314]        cpuhp_down_callbacks+0x42/0x80
[   38.681314]        _cpu_down+0xb9/0xf0
[   38.681315]        freeze_secondary_cpus+0xa3/0x390
[   38.681316]        hibernation_snapshot+0x24c/0x5f0
[   38.681317]        hibernate+0x14f/0x2b1
[   38.681318]        state_store+0xe5/0xf0
[   38.681319]        kobj_attr_store+0xf/0x20
[   38.681321]        sysfs_kf_write+0x45/0x60
[   38.681322]        kernfs_fop_write+0x124/0x1c0
[   38.681323]        __vfs_write+0x28/0x130
[   38.681324]        vfs_write+0xcb/0x1c0
[   38.681324]        SyS_write+0x49/0xb0
[   38.681326]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[   38.681326]
               other info that might help us debug this:

[   38.681326]  Possible unsafe locking scenario:

[   38.681326]        CPU0                    CPU1
[   38.681327]        ----                    ----
[   38.681327]   lock(sparse_irq_lock);
[   38.681328]                                lock((complete)&st->done);
[   38.681328]                                lock(sparse_irq_lock);
[   38.681329]   lock((complete)&st->done);
[   38.681330]
                *** DEADLOCK ***

[   38.681330] 9 locks held by rtcwake/1414:
[   38.681331]  #0:  (sb_writers#5){.+.+}, at: [<ffffffff81220161>] vfs_write+0x171/0x1c0
[   38.681333]  #1:  (&of->mutex){+.+.}, at: [<ffffffff812a3302>] kernfs_fop_write+0xf2/0x1c0
[   38.681335]  #2:  (kn->count#206){.+.+}, at: [<ffffffff812a330b>] kernfs_fop_write+0xfb/0x1c0
[   38.681337]  #3:  (pm_mutex){+.+.}, at: [<ffffffff810e79b9>] hibernate+0x59/0x2b1
[   38.681339]  #4:  (device_hotplug_lock){+.+.}, at: [<ffffffff81617ff7>] lock_device_hotplug+0x17/0x20
[   38.681342]  #5:  (acpi_scan_lock){+.+.}, at: [<ffffffff8153b3c7>] acpi_scan_lock_acquire+0x17/0x20
[   38.681345]  #6:  (cpu_add_remove_lock){+.+.}, at: [<ffffffff8108106e>] freeze_secondary_cpus+0x2e/0x390
[   38.681347]  #7:  (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff810d660b>] percpu_down_write+0x2b/0x110
[   38.681349]  #8:  (sparse_irq_lock){+.+.}, at: [<ffffffff810f2187>] irq_lock_sparse+0x17/0x20
[   38.681351]
               stack backtrace:
[   38.681353] CPU: 2 PID: 1414 Comm: rtcwake Not tainted 4.14.0-rc1-CI-CI_DRM_3099+ #1
[   38.681353] Hardware name:                  /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
[   38.681354] Call Trace:
[   38.681355]  dump_stack+0x68/0x9f
[   38.681357]  print_circular_bug+0x235/0x3c0
[   38.681358]  ? lockdep_init_map_crosslock+0x20/0x20
[   38.681359]  check_prev_add+0x430/0x840
[   38.681361]  __lock_acquire+0x1420/0x15e0
[   38.681362]  ? __lock_acquire+0x1420/0x15e0
[   38.681363]  ? lockdep_init_map_crosslock+0x20/0x20
[   38.681364]  lock_acquire+0xb0/0x200
[   38.681365]  ? wait_for_completion+0x1d/0x20
[   38.681366]  wait_for_common+0x58/0x210
[   38.681367]  ? wait_for_completion+0x1d/0x20
[   38.681368]  ? cpuhp_invoke_callback+0x840/0x840
[   38.681370]  ? stop_machine_cpuslocked+0xc1/0xd0
[   38.681370]  ? cpuhp_invoke_callback+0x840/0x840
[   38.681371]  wait_for_completion+0x1d/0x20
[   38.681372]  takedown_cpu+0x89/0xf0
[   38.681373]  ? cpuhp_complete_idle_dead+0x20/0x20
[   38.681374]  cpuhp_invoke_callback+0xa3/0x840
[   38.681375]  cpuhp_down_callbacks+0x42/0x80
[   38.681376]  _cpu_down+0xb9/0xf0
[   38.681377]  freeze_secondary_cpus+0xa3/0x390
[   38.681378]  hibernation_snapshot+0x24c/0x5f0
[   38.681379]  hibernate+0x14f/0x2b1
[   38.681380]  state_store+0xe5/0xf0
[   38.681381]  kobj_attr_store+0xf/0x20
[   38.681383]  sysfs_kf_write+0x45/0x60
[   38.681384]  kernfs_fop_write+0x124/0x1c0
[   38.681385]  __vfs_write+0x28/0x130
[   38.681386]  ? rcu_read_lock_sched_held+0x7a/0x90
[   38.681387]  ? rcu_sync_lockdep_assert+0x2f/0x60
[   38.681388]  ? __sb_start_write+0x108/0x200
[   38.681389]  vfs_write+0xcb/0x1c0
[   38.681390]  SyS_write+0x49/0xb0
[   38.681391]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[   38.681392] RIP: 0033:0x7f5ed20048f0
[   38.681393] RSP: 002b:00007ffc6aa4ba98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   38.681394] RAX: ffffffffffffffda RBX: ffffffff81492963 RCX: 00007f5ed20048f0
[   38.681394] RDX: 0000000000000005 RSI: 00005572b5c20060 RDI: 0000000000000007
[   38.681395] RBP: ffffc9000058bf88 R08: 00005572b5c1ddc0 R09: 00007f5ed24d6700
[   38.681395] R10: 00007f5ed22cdb58 R11: 0000000000000246 R12: 00005572b5c1dce0
[   38.681396] R13: 0000000000000001 R14: 0000000000000005 R15: 0000000000000005
[   38.681398]  ? __this_cpu_preempt_check+0x13/0x20

on a prior suspend.
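For context on why lockdep flags this before any actual hang: the two call chains in the report take sparse_irq_lock and the (complete)&st->done crosslock in opposite orders, and lockdep tracks acquisition order as a dependency graph, reporting a cycle the moment the second ordering appears. A toy sketch of that idea (purely illustrative, not the kernel's lockdep implementation; only the lock names are taken from the report above):

```python
# Hypothetical mini "lockdep": record which lock was taken while another
# was held, and flag a cycle in the resulting dependency graph. This
# mirrors the sparse_irq_lock vs. (complete)&st->done inversion above.

class LockGraph:
    def __init__(self):
        # lock -> set of locks that were acquired while holding it
        self.edges = {}

    def acquire(self, held, new):
        """Record that `new` was taken while `held` was held.
        Returns True if this edge would create a circular dependency."""
        if self._reaches(new, held):
            return True  # new -> ... -> held already exists: cycle
        self.edges.setdefault(held, set()).add(new)
        return False

    def _reaches(self, src, dst):
        # Depth-first search: is there already a path src -> ... -> dst?
        seen, stack = set(), [src]
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

g = LockGraph()
# CPU hotplug callback path: sparse_irq_lock taken under the completion
assert not g.acquire("(complete)&st->done", "sparse_irq_lock")
# rtcwake path: waits on the completion while holding sparse_irq_lock
assert g.acquire("sparse_irq_lock", "(complete)&st->done")  # cycle!
```

As in the real report, neither chain deadlocks on its own; the warning fires because both orderings have now been observed, so the interleaving shown in the "Possible unsafe locking scenario" table is reachable.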
