<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [SKL] DEADLOCK: Kernel deadlocks when running gem_reset_stats@reset-stats-ctx-default."
   href="https://bugs.freedesktop.org/show_bug.cgi?id=104840">104840</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[SKL] DEADLOCK: Kernel deadlocks when running gem_reset_stats@reset-stats-ctx-default.
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DRI
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>XOrg git
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>major
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>DRM/Intel
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>intel-gfx-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>antonio.argenziano@intel.com
          </td>
        </tr>

        <tr>
          <th>QA Contact</th>
          <td>intel-gfx-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>intel-gfx-bugs@lists.freedesktop.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Description:
------
Running gem_reset_stats@reset-stats-ctx-default on SKL causes a deadlock. I
think the cause is that the test uses both gem_context_destroy() and
drop_caches_set(), which contend for struct_mutex. If the context destroy gets
stuck, it occupies i915->wq, so nothing can progress: retire cannot be
scheduled, and drop_caches_set() keeps waiting for idle.
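
A minimal user-space sketch of the suspected ordering, not driver code. It
assumes that i915->wq runs one work item at a time, in order, and that idle is
only reached once the retire work has run. The names run_model, free_work and
retire_work are hypothetical and exist only for this illustration; the waiter
uses a timeout so the model reports the stall instead of hanging.

```python
import queue
import threading


def run_model(wait_timeout=0.5):
    """Model the suspected stall; returns what happened while the lock was held."""
    struct_mutex = threading.Lock()   # stands in for dev->struct_mutex
    wq = queue.Queue()                # stands in for the i915 workqueue (assumed ordered)
    idle = threading.Event()          # set only by the retire work
    free_started = threading.Event()  # tells the waiter the free work is running
    log = []

    def free_work():
        # Needs struct_mutex, like __i915_gem_free_objects in the trace.
        free_started.set()
        with struct_mutex:
            log.append("free_work")

    def retire_work():
        # Queued behind free_work, so it cannot run until free_work returns.
        idle.set()
        log.append("retire_work")

    def worker():
        # Single worker thread: items run strictly one after another.
        while True:
            item = wq.get()
            if item is None:
                return
            item()

    thread = threading.Thread(target=worker)
    thread.start()

    # drop_caches_set() side: take the lock, then wait for idle.
    with struct_mutex:
        wq.put(free_work)
        wq.put(retire_work)
        free_started.wait()
        idle_while_locked = idle.wait(timeout=wait_timeout)

    # Lock dropped: the queue can drain now.
    wq.put(None)
    thread.join()
    return {"idle_while_locked": idle_while_locked, "log": log}


if __name__ == "__main__":
    print(run_model())
```

In this model the waiter never sees idle while it holds the lock; both queued
items only complete, in order, after the lock is dropped. The real waiter has
no such timeout, which would match the hang seen in the dmesg output.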

Steps:
------
1. Execute gem_reset_stats@reset-stats-ctx-default

Actual results:
------
Driver gets deadlocked, test never completes.

Expected results:
------
Test passes.

Dmesg output:
------
[ 7484.031148] [IGT] gem_reset_stats: starting subtest reset-stats-ctx-default

[ 7613.403760] INFO: task kworker/u8:3:1714 blocked for more than 120 seconds.
[ 7613.403815]       Tainted: G     U           4.15.0-rc9+ #44
[ 7613.403844] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7613.403884] kworker/u8:3    D    0  1714      2 0x80000000
[ 7613.403999] Workqueue: i915 __i915_gem_free_work [i915]
[ 7613.404007] Call Trace:
[ 7613.404026]  ? __schedule+0x345/0xc50
[ 7613.404044]  schedule+0x39/0x90
[ 7613.404051]  schedule_preempt_disabled+0x11/0x20
[ 7613.404057]  __mutex_lock+0x3b7/0x8d0
[ 7613.404063]  ? __mutex_lock+0x122/0x8d0
[ 7613.404072]  ? trace_buffer_unlock_commit_regs+0x37/0x90
[ 7613.404151]  ? __i915_gem_free_objects+0x89/0x540 [i915]
[ 7613.404243]  __i915_gem_free_objects+0x89/0x540 [i915]
[ 7613.404319]  __i915_gem_free_work+0x51/0x90 [i915]
[ 7613.404335]  process_one_work+0x1b4/0x5d0
[ 7613.404342]  ? process_one_work+0x130/0x5d0
[ 7613.404361]  worker_thread+0x4a/0x3e0
[ 7613.404378]  kthread+0x100/0x140
[ 7613.404385]  ? process_one_work+0x5d0/0x5d0
[ 7613.404390]  ? kthread_delayed_work_timer_fn+0x80/0x80
[ 7613.404402]  ? do_group_exit+0x46/0xc0
[ 7613.404409]  ret_from_fork+0x3a/0x50
[ 7613.404437] 
               Showing all locks held in the system:
[ 7613.404447] 1 lock held by khungtaskd/39:
[ 7613.404458]  #0:  (tasklist_lock){.+.+}, at: [<0000000088c6a651>] debug_show_all_locks+0x39/0x1b0
[ 7613.404489] 1 lock held by in:imklog/809:
[ 7613.404492]  #0:  (&f->f_pos_lock){+.+.}, at: [<00000000cf80f1c9>] __fdget_pos+0x3f/0x50
[ 7613.404519] 1 lock held by dmesg/1652:
[ 7613.404523]  #0:  (&user->lock){+.+.}, at: [<00000000dd4aba83>] devkmsg_read+0x3a/0x2f0
[ 7613.404543] 3 locks held by gem_reset_stats/1713:
[ 7613.404547]  #0:  (sb_writers#10){.+.+}, at: [<00000000aadbc565>] vfs_write+0x18a/0x1c0
[ 7613.404571]  #1:  (&attr->mutex){+.+.}, at: [<000000000e818033>] simple_attr_write+0x35/0xc0
[ 7613.404590]  #2:  (&dev->struct_mutex){+.+.}, at: [<0000000000b72f77>] i915_drop_caches_set+0x4e/0x1a0 [i915]
[ 7613.404669] 3 locks held by kworker/u8:3/1714:
[ 7613.404672]  #0:  ((wq_completion)"i915"){+.+.}, at: [<00000000d83ffa4e>] process_one_work+0x130/0x5d0
[ 7613.404693]  #1:  ((work_completion)(&i915->mm.free_work)){+.+.}, at: [<00000000d83ffa4e>] process_one_work+0x130/0x5d0
[ 7613.404713]  #2:  (&dev->struct_mutex){+.+.}, at: [<000000007b02c7ef>] __i915_gem_free_objects+0x89/0x540 [i915]

[ 7613.404795] =============================================</pre>
        </div>
      </p>


    </body>
</html>