[Bug 111601] regression: deadlock-freeze due to kernel commit aa56a292ce623734ddd30f52d73f527d1f3529b5

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Sep 10 11:06:16 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=111601

howaboutsynergy <howaboutsynergy at pm.me> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #145318|0                           |1
        is obsolete|                            |

--- Comment #9 from howaboutsynergy <howaboutsynergy at pm.me> ---
Created attachment 145319
  --> https://bugs.freedesktop.org/attachment.cgi?id=145319&action=edit
got stacktraces also at the point of `callback failed with -4 in blockable
context` not just on `return -EINTR`

I just realized I could do better and also get stack traces at this point:
mm/mmu_notifier.c:179:                          pr_info("%pS callback failed with %d in %sblockable context.\n",

using this patch:
```c
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index b5670620aea0..2ec61d700e72 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -179,6 +179,7 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
                                pr_info("%pS callback failed with %d in %sblockable context.\n",
                                        mn->ops->invalidate_range_start, _ret,
                                        !mmu_notifier_range_blockable(range) ? "non-" : "");
+        WARN_ON(1);
                                ret = _ret;
                        }
                }
```

The first attempt produced no stack traces (I'm not sure why), but the second
attempt succeeded. The output looks like this:
```
[  261.691955] ------------[ cut here ]------------
[  261.691961] WARN_ON(1)
[  261.692078] WARNING: CPU: 1 PID: 2240 at drivers/gpu/drm/i915/gem/i915_gem_userptr.c:141 userptr_mn_invalidate_range_start+0x176/0x220 [i915]
[  261.692082] Modules linked in: xt_comment xt_TCPMSS iptable_mangle iptable_security iptable_nat nf_nat iptable_raw nf_log_ipv4 nf_log_common xt_conntrack xt_LOG xt_connlimit nf_conncount nf_conntrack nf_defrag_ipv4 xt_hashlimit xt_multiport xt_owner xt_addrtype snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp i915 i2c_algo_bit crct10dif_pclmul drm_kms_helper crc32_pclmul snd_hda_intel crc32c_intel snd_hda_codec syscopyarea sysfillrect snd_hwdep sysimgblt snd_hda_core ghash_clmulni_intel fb_sys_fops drm intel_cstate iTCO_wdt snd_pcm e1000e intel_uncore iTCO_vendor_support snd_timer intel_rapl_perf bfq snd soundcore pcspkr i2c_i801 drm_panel_orientation_quirks xhci_pci xhci_hcd
[  261.692125] CPU: 1 PID: 2240 Comm: stress Kdump: loaded Tainted: G     U            5.3.0-rc8-gf74c2bb98776 #57
[  261.692128] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2201 05/27/2019
[  261.692203] RIP: 0010:userptr_mn_invalidate_range_start+0x176/0x220 [i915]
[  261.692209] Code: ff ff ff 48 89 ef e8 c9 77 f5 c6 84 c0 74 08 48 89 ef e8 1d 3b e2 ff 48 c7 c6 81 c0 5f c0 48 c7 c7 7e c0 5f c0 e8 63 ee b7 c6 <0f> 0b 41 bf fc ff ff ff e9 97 fe ff ff be 01 00 00 00 48 89 ef e8
[  261.692213] RSP: 0018:ffffafc9c8f5f768 EFLAGS: 00010286
[  261.692217] RAX: 0000000000000000 RBX: ffffafc9c8f5f820 RCX: 0000000000000000
[  261.692220] RDX: 000000000000000a RSI: ffffffff8856c9ca RDI: ffffffff8856d9ca
[  261.692223] RBP: ffffa42007915500 R08: 0000003cee0e71d7 R09: 000000000000000a
[  261.692225] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffa41ffd7a8068
[  261.692228] R13: ffffa42069d80c18 R14: ffffa42007a9b990 R15: 0000000000000000
[  261.692232] FS:  00007f7258f5a740(0000) GS:ffffa4206d840000(0000) knlGS:0000000000000000
[  261.692235] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  261.692238] CR2: 0000737968747578 CR3: 00000007c7d1e006 CR4: 00000000003606e0
[  261.692241] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  261.692244] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  261.692246] Call Trace:
[  261.692261]  __mmu_notifier_invalidate_range_start+0x4f/0x90
[  261.692268]  try_to_unmap_one+0x718/0x820
[  261.692273]  rmap_walk_file+0xe4/0x250
[  261.692278]  try_to_unmap+0xc1/0xf0
[  261.692283]  ? page_remove_rmap+0x2a0/0x2a0
[  261.692287]  ? page_not_mapped+0x10/0x10
[  261.692291]  ? page_get_anon_vma+0x70/0x70
[  261.692296]  migrate_pages+0x7aa/0x9a0
[  261.692304]  ? isolate_freepages_block+0x340/0x340
[  261.692310]  ? move_freelist_tail+0xd0/0xd0
[  261.692314]  compact_zone+0x656/0xa70
[  261.692319]  compact_zone_order+0xde/0x120
[  261.692324]  try_to_compact_pages+0x187/0x240
[  261.692332]  __alloc_pages_direct_compact+0x87/0x170
[  261.692338]  __alloc_pages_slowpath+0x1f8/0xc10
[  261.692345]  ? get_page_from_freelist+0xe80/0x1330
[  261.692352]  __alloc_pages_nodemask+0x268/0x2b0
[  261.692357]  alloc_pages_vma+0xc1/0x160
[  261.692363]  do_huge_pmd_anonymous_page+0x271/0x610
[  261.692369]  __handle_mm_fault+0xbfc/0x12f0
[  261.692375]  handle_mm_fault+0xa9/0x1d0
[  261.692382]  __do_page_fault+0x23a/0x480
[  261.692388]  do_page_fault+0x1a/0x64
[  261.692395]  page_fault+0x39/0x40
[  261.692400] RIP: 0033:0x5afb2a4d7c10
[  261.692406] Code: c0 0f 84 53 02 00 00 8b 54 24 0c 31 c0 85 d2 0f 94 c0 89 04 24 41 83 fd 02 0f 8f fa 00 00 00 31 c0 4d 85 ff 7e 10 0f 1f 40 00 <c6> 04 03 5a 4c 01 f0 49 39 c7 7f f4 4d 85 e4 0f 84 f4 01 00 00 7e
[  261.692409] RSP: 002b:00007ffe9f9a6050 EFLAGS: 00010206
[  261.692413] RAX: 000000036dd5a000 RBX: 00007f6b76aa6010 RCX: 00007f725907f6fb
[  261.692415] RDX: 0000000000000001 RSI: 00000006e24b4000 RDI: 00007f6b76aa6000
[  261.692418] RBP: 00005afb2a4d8a54 R08: 00007f6b76aa6010 R09: 0000000000000000
[  261.692420] R10: 0000000000000022 R11: 00000006e24b3000 R12: ffffffffffffffff
[  261.692423] R13: 0000000000000002 R14: 0000000000001000 R15: 00000006e24b3000
[  261.692427] ---[ end trace 4cdaeb6ba05f3a75 ]---
[  261.692502] userptr_mn_invalidate_range_start+0x0/0x220 [i915] callback failed with -4 in blockable context.
[  261.692505] ------------[ cut here ]------------
[  261.692518] WARNING: CPU: 1 PID: 2240 at mm/mmu_notifier.c:182 __mmu_notifier_invalidate_range_start.cold+0x33/0x43
[  261.692520] Modules linked in: xt_comment xt_TCPMSS iptable_mangle iptable_security iptable_nat nf_nat iptable_raw nf_log_ipv4 nf_log_common xt_conntrack xt_LOG xt_connlimit nf_conncount nf_conntrack nf_defrag_ipv4 xt_hashlimit xt_multiport xt_owner xt_addrtype snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp i915 i2c_algo_bit crct10dif_pclmul drm_kms_helper crc32_pclmul snd_hda_intel crc32c_intel snd_hda_codec syscopyarea sysfillrect snd_hwdep sysimgblt snd_hda_core ghash_clmulni_intel fb_sys_fops drm intel_cstate iTCO_wdt snd_pcm e1000e intel_uncore iTCO_vendor_support snd_timer intel_rapl_perf bfq snd soundcore pcspkr i2c_i801 drm_panel_orientation_quirks xhci_pci xhci_hcd
[  261.692554] CPU: 1 PID: 2240 Comm: stress Kdump: loaded Tainted: G     U  W         5.3.0-rc8-gf74c2bb98776 #57
[  261.692557] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2201 05/27/2019
[  261.692563] RIP: 0010:__mmu_notifier_invalidate_range_start.cold+0x33/0x43
[  261.692567] Code: ec 9f ea 87 49 0f 44 cf 48 8b 43 10 48 8b 70 28 89 ea 48 c7 c7 78 9f e8 87 e8 55 f2 ef ff 48 c7 c7 c8 44 e6 87 e8 49 f2 ef ff <0f> 0b 41 89 ee e9 6f fc ff ff 90 90 90 90 90 90 53 48 83 ec 10 48
[  261.692570] RSP: 0018:ffffafc9c8f5f7c0 EFLAGS: 00010246
[  261.692574] RAX: 0000000000000024 RBX: ffffa42069d80c18 RCX: 0000000000000006
[  261.692576] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffffa4206d8564c0
[  261.692579] RBP: 00000000fffffffc R08: ffffafc9c8f5f675 R09: 000000000000167c
[  261.692582] R10: ffffa4208e079cb4 R11: ffffafc9c8f5f675 R12: ffffafc9c8f5f820
[  261.692584] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff87e89f71
[  261.692588] FS:  00007f7258f5a740(0000) GS:ffffa4206d840000(0000) knlGS:0000000000000000
[  261.692591] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  261.692593] CR2: 0000737968747578 CR3: 00000007c7d1e006 CR4: 00000000003606e0
[  261.692596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  261.692599] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  261.692601] Call Trace:
[  261.692607]  try_to_unmap_one+0x718/0x820
[  261.692612]  rmap_walk_file+0xe4/0x250
[  261.692617]  try_to_unmap+0xc1/0xf0
[  261.692621]  ? page_remove_rmap+0x2a0/0x2a0
[  261.692625]  ? page_not_mapped+0x10/0x10
[  261.692629]  ? page_get_anon_vma+0x70/0x70
[  261.692634]  migrate_pages+0x7aa/0x9a0
[  261.692640]  ? isolate_freepages_block+0x340/0x340
[  261.692646]  ? move_freelist_tail+0xd0/0xd0
[  261.692650]  compact_zone+0x656/0xa70
[  261.692655]  compact_zone_order+0xde/0x120
[  261.692660]  try_to_compact_pages+0x187/0x240
[  261.692667]  __alloc_pages_direct_compact+0x87/0x170
[  261.692673]  __alloc_pages_slowpath+0x1f8/0xc10
[  261.692680]  ? get_page_from_freelist+0xe80/0x1330
[  261.692687]  __alloc_pages_nodemask+0x268/0x2b0
[  261.692692]  alloc_pages_vma+0xc1/0x160
[  261.692698]  do_huge_pmd_anonymous_page+0x271/0x610
[  261.692703]  __handle_mm_fault+0xbfc/0x12f0
[  261.692709]  handle_mm_fault+0xa9/0x1d0
[  261.692715]  __do_page_fault+0x23a/0x480
[  261.692721]  do_page_fault+0x1a/0x64
[  261.692727]  page_fault+0x39/0x40
[  261.692730] RIP: 0033:0x5afb2a4d7c10
[  261.692734] Code: c0 0f 84 53 02 00 00 8b 54 24 0c 31 c0 85 d2 0f 94 c0 89 04 24 41 83 fd 02 0f 8f fa 00 00 00 31 c0 4d 85 ff 7e 10 0f 1f 40 00 <c6> 04 03 5a 4c 01 f0 49 39 c7 7f f4 4d 85 e4 0f 84 f4 01 00 00 7e
[  261.692737] RSP: 002b:00007ffe9f9a6050 EFLAGS: 00010206
[  261.692740] RAX: 000000036dd5a000 RBX: 00007f6b76aa6010 RCX: 00007f725907f6fb
[  261.692743] RDX: 0000000000000001 RSI: 00000006e24b4000 RDI: 00007f6b76aa6000
[  261.692745] RBP: 00005afb2a4d8a54 R08: 00007f6b76aa6010 R09: 0000000000000000
[  261.692748] R10: 0000000000000022 R11: 00000006e24b3000 R12: ffffffffffffffff
[  261.692750] R13: 0000000000000002 R14: 0000000000001000 R15: 00000006e24b3000
[  261.692754] ---[ end trace 4cdaeb6ba05f3a76 ]---
```
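
For anyone reading the log: the `-4` in the `callback failed with -4` line is
`-EINTR` on Linux, and the `blockable` / `non-blockable` wording comes from the
conditional in the pr_info() call shown in the patch. Below is a minimal
userspace sketch of that decoding, not kernel code. The helper names
`ret_name` and `format_failure` are hypothetical, and the kernel's `%pS`
symbol lookup is replaced here by a plain string argument.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: name the errno behind a negative return value. */
static const char *ret_name(int ret)
{
    if (ret == -EINTR)
        return "EINTR";
    if (ret == -EAGAIN)
        return "EAGAIN";
    return "other";
}

/*
 * Hypothetical userspace stand-in for the pr_info() format in the patch.
 * The callback name is passed as a string because %pS exists only in
 * the kernel's printk implementation.
 */
static int format_failure(char *buf, size_t len, const char *callback,
                          int ret, bool blockable)
{
    return snprintf(buf, len,
                    "%s callback failed with %d in %sblockable context.",
                    callback, ret, !blockable ? "non-" : "");
}
```

Calling `format_failure(buf, sizeof buf, "userptr_mn_invalidate_range_start+0x0/0x220 [i915]", -4, true)`
reproduces the message text seen in the log above, and `ret_name(-4)` gives
`EINTR`.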

A semi-full log of all stack traces triggered by the WARN_ON(1) that I added
is attached.
