[PATCH] drm/msm: fix splat when userspace is killed with pending atomic update
Daniel Vetter
daniel at ffwll.ch
Tue May 2 09:01:17 UTC 2017
On Fri, Apr 28, 2017 at 8:05 PM, Rob Clark <robdclark at gmail.com> wrote:
> The ->preclose() hook is a good place to block for pending atomic
> updates. We can't do this in ->postclose(), as it needs to happen
> before drm_fb_release(). Otherwise, since we have already swapped
> state (in the case of a non-blocking atomic update), this means that
> the plane_state->fb will be released and cleared before we wait for
> fences from the atomic-commit wq.
>
> There are probably more complex solutions possible. But an already
> scheduled atomic update, possibly blocked on already scheduled gpu/etc
> fences, will complete eventually (assuming nothing catches fire), so
> the sanest thing seems to be to just block until the already scheduled
> atomic updates complete before tearing things down.
>
> Fixes:
>
> WARNING: CPU: 1 PID: 69 at ../drivers/gpu/drm/drm_atomic_helper.c:1061 drm_atomic_helper_wait_for_fences+0xe0/0xf8
> Modules linked in:
>
> CPU: 1 PID: 69 Comm: kworker/1:1 Tainted: G W 4.11.0-rc8+ #1187
> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> Workqueue: events drm_mode_rmfb_work_fn
> task: ffffffc036560d00 task.stack: ffffffc036550000
> PC is at drm_atomic_helper_wait_for_fences+0xe0/0xf8
> LR is at complete_commit.isra.1+0x44/0x1c0
> pc : [<ffffff80084f6040>] lr : [<ffffff800854176c>] pstate: 20000145
> sp : ffffffc036553b60
> x29: ffffffc036553b60 x28: ffffffc0264e6a00
> x27: ffffffc035659000 x26: 0000000000000000
> x25: ffffffc0240e8000 x24: 0000000000000038
> x23: 0000000000000000 x22: ffffff800858f200
> x21: ffffffc0240e8000 x20: ffffffc02f56a800
> x19: 0000000000000000 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000
> x15: 0000000000000000 x14: ffffffc00a192700
> x13: 0000000000000004 x12: 0000000000000000
> x11: ffffff80089a1690 x10: 00000000000008f0
> x9 : ffffffc036553b20 x8 : ffffffc036561650
> x7 : ffffffc03fe6cb40 x6 : 0000000000000000
> x5 : 0000000000000001 x4 : 0000000000000002
> x3 : ffffffc035659000 x2 : ffffffc0240e8c80
> x1 : 0000000000000000 x0 : ffffffc02adbe588
>
> ---[ end trace 13aeec77c3fb55e2 ]---
> Call trace:
> Exception stack(0xffffffc036553990 to 0xffffffc036553ac0)
> 3980: 0000000000000000 0000008000000000
> 39a0: ffffffc036553b60 ffffff80084f6040 0000000000004ff0 0000000000000038
> 39c0: ffffffc0365539d0 ffffff800857e098 ffffffc036553a00 ffffff800857e1b0
> 39e0: ffffffc036553a10 ffffff800857c554 ffffffc0365e8400 ffffffc0365e8400
> 3a00: ffffffc036553a20 ffffff8008103358 000000000001aad7 ffffff800851b72c
> 3a20: ffffffc036553a50 ffffff80080e9228 ffffffc02adbe588 0000000000000000
> 3a40: ffffffc0240e8c80 ffffffc035659000 0000000000000002 0000000000000001
> 3a60: 0000000000000000 ffffffc03fe6cb40 ffffffc036561650 ffffffc036553b20
> 3a80: 00000000000008f0 ffffff80089a1690 0000000000000000 0000000000000004
> 3aa0: ffffffc00a192700 0000000000000000 0000000000000000 0000000000000000
> [<ffffff80084f6040>] drm_atomic_helper_wait_for_fences+0xe0/0xf8
> [<ffffff800854176c>] complete_commit.isra.1+0x44/0x1c0
> [<ffffff8008541c64>] msm_atomic_commit+0x32c/0x350
> [<ffffff8008516230>] drm_atomic_commit+0x50/0x60
> [<ffffff8008517548>] drm_atomic_remove_fb+0x158/0x250
> [<ffffff80085186d0>] drm_framebuffer_remove+0x50/0x158
> [<ffffff8008518818>] drm_mode_rmfb_work_fn+0x40/0x58
> [<ffffff80080d5668>] process_one_work+0x1d0/0x378
> [<ffffff80080d5a54>] worker_thread+0x244/0x488
> [<ffffff80080db7fc>] kthread+0xfc/0x128
> [<ffffff8008082ec0>] ret_from_fork+0x10/0x50
>
> Reported-by: Stanimir Varbanov <stanimir.varbanov at linaro.org>
> Signed-off-by: Rob Clark <robdclark at gmail.com>
> ---
> The hunk that removes the comment about ->preclose() is included in this
> patch to challenge the assumption that ->preclose() shouldn't exist ;-)
And I'm going to challenge your patch here. Fences, framebuffers, and
atomic commits are all refcounted. If you go boom on them
when userspace closes the fd, you have a refcount bug. We don't fix
those by flushing stuff :-)
Please add a pair of get/put() calls at the right place instead.
-Daniel
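
For readers following the thread, the get/put pairing suggested above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not code from the patch or from the DRM core: struct fb,
struct commit, and every function name below are hypothetical, and the counter
is a plain int where kernel code would use an atomic refcount such as
struct kref. The point it shows is that a queued commit takes its own reference,
so dropping the file's reference at close time cannot free the object underneath
the pending work.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as a framebuffer. */
struct fb {
    int refcount;     /* live references; plain int, single-threaded sketch only */
    int *freed_flag;  /* set to 1 when the last reference is dropped */
};

static struct fb *fb_create(int *freed_flag)
{
    struct fb *fb = malloc(sizeof(*fb));

    if (!fb)
        return NULL;
    fb->refcount = 1;            /* the creator (the open file) owns this one */
    fb->freed_flag = freed_flag;
    *freed_flag = 0;
    return fb;
}

static void fb_get(struct fb *fb)
{
    assert(fb->refcount > 0);    /* taking a reference on a dead object is a bug */
    fb->refcount++;
}

static void fb_put(struct fb *fb)
{
    assert(fb->refcount > 0);
    if (--fb->refcount == 0) {
        *fb->freed_flag = 1;     /* record the release for the demo below */
        free(fb);
    }
}

/* Hypothetical pending commit that will touch the fb later. */
struct commit {
    struct fb *fb;
};

static void commit_queue(struct commit *c, struct fb *fb)
{
    fb_get(fb);                  /* get: the commit holds its own reference */
    c->fb = fb;
}

static void commit_complete(struct commit *c)
{
    /* ... deferred work would safely use c->fb here ... */
    fb_put(c->fb);               /* put: paired with the get in commit_queue() */
    c->fb = NULL;
}

/*
 * Returns 1 if the fb survived the "close" and was freed only once the
 * commit completed; returns 0 otherwise (including allocation failure).
 */
static int demo_close_with_pending_commit(void)
{
    int freed = 0;
    int alive_after_close;
    struct commit c;
    struct fb *fb = fb_create(&freed);

    if (!fb)
        return 0;

    commit_queue(&c, fb);        /* non-blocking update is now pending */
    fb_put(fb);                  /* userspace closes the fd: file ref dropped */
    alive_after_close = !freed;  /* commit's reference keeps the fb alive */

    commit_complete(&c);         /* last reference dropped, fb released */
    return alive_after_close && freed;
}
```

With the pairing in place, the close path needs no flush: the release is simply
deferred until whichever holder drops the last reference.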
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch