[Bug 111945] New: [CI][SHARDS] igt@gem_ctx_switch@queue-heavy|igt@gem_exec_flush@basic-wb-prw-default - dmesg-warn - WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected

bugzilla-daemon@freedesktop.org
Wed Oct 9 19:09:33 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=111945

            Bug ID: 111945
           Summary: [CI][SHARDS] igt@gem_ctx_switch@queue-heavy|igt@gem_exec_flush@basic-wb-prw-default - dmesg-warn - WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
           Product: DRI
           Version: DRI git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: not set
          Priority: not set
         Component: DRM/Intel
          Assignee: intel-gfx-bugs@lists.freedesktop.org
          Reporter: lakshminarayana.vudum@intel.com
        QA Contact: intel-gfx-bugs@lists.freedesktop.org
                CC: intel-gfx-bugs@lists.freedesktop.org

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7033/shard-kbl7/igt@gem_ctx_switch@queue-heavy.html
<6> [1065.900423] Console: switching to colour dummy device 80x25
<6> [1065.900479] [IGT] gem_ctx_switch: executing
<5> [1065.903630] Setting dangerous option reset - tainting kernel
<6> [1065.912493] [IGT] gem_ctx_switch: starting subtest queue-heavy
<4> [1087.386382] 
<4> [1087.386392] =====================================================
<4> [1087.386403] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
<4> [1087.386416] 5.4.0-rc2-CI-CI_DRM_7033+ #1 Tainted: G     U           
<4> [1087.386429] -----------------------------------------------------
<4> [1087.386444] kworker/2:3/423 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
<4> [1087.386460] ffff88826f4250c8 (&(&lock->wait_lock)->rlock){+.+.}, at: __mutex_unlock_slowpath+0x18e/0x2b0
<4> [1087.386488] 
and this task is already holding:
<4> [1087.386500] ffff88825875c298 (&(&timelines->lock)->rlock){-...}, at: intel_gt_retire_requests_timeout+0x15c/0x520 [i915]
<4> [1087.386655] which would create a new lock dependency:
<4> [1087.386662]  (&(&timelines->lock)->rlock){-...} -> (&(&lock->wait_lock)->rlock){+.+.}
<4> [1087.386677] 
but this new dependency connects a HARDIRQ-irq-safe lock:
<4> [1087.386692]  (&(&timelines->lock)->rlock){-...}
<4> [1087.386695] 
... which became HARDIRQ-irq-safe at:
<4> [1087.386724]   lock_acquire+0xa7/0x1c0
<4> [1087.386739]   _raw_spin_lock_irqsave+0x33/0x50
<4> [1087.386876]   intel_timeline_enter+0x64/0x150 [i915]
<4> [1087.387000]   __engine_park+0x1db/0x400 [i915]
<4> [1087.387120]   ____intel_wakeref_put_last+0x1c/0x70 [i915]
<4> [1087.387234]   i915_sample+0x2de/0x300 [i915]
<4> [1087.387249]   __hrtimer_run_queues+0x121/0x4a0
<4> [1087.387262]   hrtimer_interrupt+0xea/0x250
<4> [1087.387276]   smp_apic_timer_interrupt+0x96/0x280
<4> [1087.387289]   apic_timer_interrupt+0xf/0x20
<4> [1087.387303]   cpuidle_enter_state+0xb2/0x450
<4> [1087.387315]   cpuidle_enter+0x24/0x40
<4> [1087.387326]   do_idle+0x1e7/0x250
<4> [1087.387336]   cpu_startup_entry+0x14/0x20
<4> [1087.387347]   start_kernel+0x4d2/0x4f4
<4> [1087.387357]   secondary_startup_64+0xa4/0xb0
<4> [1087.387368] 
to a HARDIRQ-irq-unsafe lock:
<4> [1087.387381]  (&(&lock->wait_lock)->rlock){+.+.}
<4> [1087.387384] 
... which became HARDIRQ-irq-unsafe at:
<4> [1087.387410] ...
<4> [1087.387416]   lock_acquire+0xa7/0x1c0
<4> [1087.387434]   _raw_spin_lock+0x2a/0x40
<4> [1087.387447]   __mutex_lock+0x198/0x9d0
<4> [1087.387461]   pipe_wait+0x8f/0xc0
<4> [1087.387470]   pipe_read+0x235/0x310
<4> [1087.387480]   new_sync_read+0x10f/0x1a0
<4> [1087.387490]   vfs_read+0x96/0x160
<4> [1087.387497]   ksys_read+0x9f/0xe0
<4> [1087.387509]   do_syscall_64+0x4f/0x210
<4> [1087.387523]   entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [1087.387535] 
other info that might help us debug this:

<4> [1087.387553]  Possible interrupt unsafe locking scenario:

<4> [1087.387568]        CPU0                    CPU1
<4> [1087.387580]        ----                    ----
<4> [1087.387590]   lock(&(&lock->wait_lock)->rlock);
<4> [1087.387601]                                local_irq_disable();
<4> [1087.387610]                                lock(&(&timelines->lock)->rlock);
<4> [1087.387621]                                lock(&(&lock->wait_lock)->rlock);
<4> [1087.387632]   <Interrupt>
<4> [1087.387637]     lock(&(&timelines->lock)->rlock);
<4> [1087.387646] 
 *** DEADLOCK ***

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7039/shard-skl3/igt@gem_exec_flush@basic-wb-prw-default.html
<4> [2139.620735] =====================================================
<4> [2139.620762] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
<4> [2139.620795] 5.4.0-rc2-CI-CI_DRM_7039+ #1 Tainted: G     U           
<4> [2139.620828] -----------------------------------------------------
<4> [2139.620857] kworker/1:0/5423 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
<4> [2139.620882] ffff888176b25e08 (&(&lock->wait_lock)->rlock){+.+.}, at: __mutex_unlock_slowpath+0xa6/0x2b0
<4> [2139.620936] 
and this task is already holding:
<4> [2139.620958] ffff88816d5fc288 (&(&timelines->lock)->rlock){-...}, at: intel_gt_retire_requests_timeout+0x17d/0x540 [i915]
<4> [2139.621273] which would create a new lock dependency:
<4> [2139.621291]  (&(&timelines->lock)->rlock){-...} -> (&(&lock->wait_lock)->rlock){+.+.}
<4> [2139.621328] 
but this new dependency connects a HARDIRQ-irq-safe lock:
<4> [2139.621354]  (&(&timelines->lock)->rlock){-...}
<4> [2139.621360] 
... which became HARDIRQ-irq-safe at:
<4> [2139.621410]   lock_acquire+0xa7/0x1c0
<4> [2139.621436]   _raw_spin_lock_irqsave+0x33/0x50
<4> [2139.621732]   intel_timeline_enter+0x64/0x150 [i915]
<4> [2139.622007]   __engine_park+0x1db/0x400 [i915]
<4> [2139.622258]   ____intel_wakeref_put_last+0x1c/0x70 [i915]
<4> [2139.622513]   i915_sample+0x2de/0x300 [i915]
<4> [2139.622539]   __hrtimer_run_queues+0x121/0x4a0
<4> [2139.622562]   hrtimer_interrupt+0xea/0x250
<4> [2139.622586]   smp_apic_timer_interrupt+0x96/0x280
<4> [2139.622610]   apic_timer_interrupt+0xf/0x20
<4> [2139.622634]   cpuidle_enter_state+0xb2/0x450
<4> [2139.622656]   cpuidle_enter+0x24/0x40
<4> [2139.622682]   do_idle+0x1e7/0x250
<4> [2139.622706]   cpu_startup_entry+0x14/0x20
<4> [2139.622735]   start_kernel+0x4d2/0x4f4
<4> [2139.622757]   secondary_startup_64+0xa4/0xb0
<4> [2139.622775] 
to a HARDIRQ-irq-unsafe lock:
<4> [2139.622806]  (&(&lock->wait_lock)->rlock){+.+.}
<4> [2139.622812] 
... which became HARDIRQ-irq-unsafe at:
<4> [2139.622852] ...
<4> [2139.622864]   lock_acquire+0xa7/0x1c0
<4> [2139.622897]   _raw_spin_lock+0x2a/0x40
<4> [2139.622920]   __mutex_lock+0x198/0x9d0
<4> [2139.622943]   hub_port_init+0x70/0xcd0
<4> [2139.622965]   hub_event+0x797/0x16d0
<4> [2139.622987]   process_one_work+0x26a/0x620
<4> [2139.623009]   worker_thread+0x37/0x380
<4> [2139.623033]   kthread+0x119/0x130
<4> [2139.623056]   ret_from_fork+0x3a/0x50
<4> [2139.623072] 
other info that might help us debug this:

<4> [2139.623102]  Possible interrupt unsafe locking scenario:

<4> [2139.623127]        CPU0                    CPU1
<4> [2139.623145]        ----                    ----
<4> [2139.623163]   lock(&(&lock->wait_lock)->rlock);
<4> [2139.623185]                                local_irq_disable();
<4> [2139.623206]                                lock(&(&timelines->lock)->rlock);
<4> [2139.623234]                                lock(&(&lock->wait_lock)->rlock);
<4> [2139.623261]   <Interrupt>
<4> [2139.623274]     lock(&(&timelines->lock)->rlock);
<4> [2139.623296] 
 *** DEADLOCK ***
