[Intel-gfx] Oops at shutdown in intel_unpin_fb_obj()

Daniel Vetter daniel at ffwll.ch
Mon Jan 30 09:38:11 UTC 2017


On Sun, Jan 29, 2017 at 11:42:32AM -0800, Linus Torvalds wrote:
> Guys, I've gotten absolutely no response to this, and the problem
> seems to still occur.
> 
> I just got a slightly different hang at shutdown, due to a kernel oops
> that seems related. It's not identical - the call trace is very
> different - but it's close.
> 
> In particular, it's once again the same NULL pointer dereference in
> "intel_unpin_fb_obj()", except this time it looked like this:
> 
>   BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
>   IP: intel_unpin_fb_obj+0x69/0xe0 [i915]
>   Oops: 0000 [#1] SMP
>   Modules linked in: fuse xt_CHECKSUM ipt_MASQUERADE
> nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
> xt_conntrack ebtable_nat ebtable_broute bridge stp llc ip6ta$
>    tpm_tis industrialio tpm_tis_core acpi_pad tpm nfsd auth_rpcgss
> nfs_acl lockd grace sunrpc dm_crypt hid_logitech_hidpp hid_logitech_dj
> i915 crct10dif_pclmul i2c_algo_bit crc32_pc$
>   CPU: 4 PID: 26173 Comm: kworker/u16:9 Tainted: G        W
> 4.10.0-rc5-00111-g49e555a932de #1
>   Hardware name: System manufacturer System Product Name/Z170-K, BIOS
> 1803 05/06/2016
>   Workqueue: i915 intel_unpin_work_fn [i915]
>   RIP: 0010:intel_unpin_fb_obj+0x69/0xe0 [i915]
>   RSP: 0000:ffffb95c4937bdc0 EFLAGS: 00010286
>   RAX: 0000000000000000 RBX: ffff96f284441340 RCX: 0000000000000000
>   RDX: ffffb95c4937bdc0 RSI: ffff96f29f273908 RDI: ffff96f284441340
>   RBP: ffffb95c4937be08 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00000000fa83b2da R11: 0000000000808111 R12: ffff96f20d878500
>   R13: 0000000000000001 R14: ffff96f29f58c400 R15: ffff96f29f270068
>   FS:  0000000000000000(0000) GS:ffff96f2b6d00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000078 CR3: 000000041ff4b000 CR4: 00000000003406e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>    intel_unpin_work_fn+0x58/0x140 [i915]
>    process_one_work+0x1f1/0x480
>    worker_thread+0x48/0x4d0
>    kthread+0x101/0x140
>    ret_from_fork+0x29/0x40
>   Code: ff ff ff 74 67 48 8d 7d b8 44 89 ea 4c 89 e6 e8 ce 2c ff ff 48
> 8b 43 08 48 8d 55 b8 48 89 df 48 8d b0 08 39 00 00 e8 47 1b fc ff <48>
> 8b 50 78 48 85 d2 74 04 83 6a 20 01 48 $
>   RIP: intel_unpin_fb_obj+0x69/0xe0 [i915] RSP: ffffb95c4937bdc0
>   CR2: 0000000000000078
>   ---[ end trace afab57e9d299b42b ]---
> 
> so this time it was the worker thread that died and took the system
> down with it.
> 
> Anyway, there is something *seriously* wrong with the i915 shutdown sequence.
> 
> Now, maybe this was fixed with the recent drm pull that did have some
> i915 fixes in it, and I wasn't running on my desktop yet, but nothing
> there looks very obvious.
> 
> And once again, I'd like to note that other users of
> i915_gem_object_to_ggtt() do seem to check for a NULL vma, while
> intel_unpin_fb_obj() simply passes any potential NULL vma to
> i915_vma_unpin_fence().
> 
> Guys?

Hm, this fell through the cracks somehow :( It's the vma tracking mixup, which
is properly fixed for 4.11. We're not handling the different flavours of
GPU mappings correctly, so if you mix tiling (because of the partial mmap
stuff we've enabled recently) with rotation and so on, it eventually goes
boom. The trouble is that the proper fix also involves core DRM modeset
changes and lots of small shuffling in i915, so it's in no way material for
-fixes. We're discussing on IRC what could be done. One option might be to
disable the partial mmap stuff again, which hides the bug as well as before
(trading in some userspace faults that make your compositor blow up
in corner cases, but older bugs win in no-regression land). Or we shrug it
off as unlikely, accept the leak, and make the WARN_ON you added silent
for 4.10.
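
For illustration only, the "accept the leak" option amounts to guarding the
unpin path against a failed vma lookup. The sketch below is a self-contained
userspace model, not the i915 code: every struct and function name in it
(model_fence, model_vma, model_obj, model_obj_to_ggtt, model_unpin_fb_obj) is
a hypothetical stand-in, and the real driver's locking and view handling are
left out. It only shows the shape of the check: if the lookup returns NULL,
skip the unpin rather than dereference the pointer. The oops above is
consistent with the unguarded case, since the faulting address 0x78 with
RAX == 0 looks like a field read through a NULL pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's structures (assumed names). */
struct model_fence {
    int pin_count;
};

struct model_vma {
    struct model_fence *fence; /* may be NULL when no fence is attached */
    int pin_count;
};

struct model_obj {
    struct model_vma *ggtt_vma; /* NULL when no matching mapping is tracked */
};

/* Models a lookup that can legitimately fail and return NULL. */
static struct model_vma *model_obj_to_ggtt(struct model_obj *obj)
{
    assert(obj != NULL);
    return obj->ggtt_vma;
}

/*
 * Guarded unpin: returns true when the vma was found and unpinned,
 * false when the lookup failed and the unpin was skipped (the pin,
 * if any, is leaked instead of crashing).
 */
static bool model_unpin_fb_obj(struct model_obj *obj)
{
    struct model_vma *vma = model_obj_to_ggtt(obj);

    if (vma == NULL) {
        fprintf(stderr, "model: no vma found, skipping unpin\n");
        return false;
    }

    /* Drop the fence pin only if a fence is attached. */
    if (vma->fence != NULL)
        vma->fence->pin_count--;

    vma->pin_count--;
    return true;
}
```

Calling model_unpin_fb_obj() on an object whose ggtt_vma is NULL returns
false and touches nothing; on an object with a pinned vma it drops both pin
counts by one. Whether a silent skip is acceptable in the real driver is
exactly the trade-off described above.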
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

