[PATCH] drm/syncobj: make lockdep complain on WAIT_FOR_SUBMIT
Daniel Vetter
daniel at ffwll.ch
Fri Jan 15 13:52:42 UTC 2021
On Fri, Jan 15, 2021 at 02:35:50PM +0100, Christian König wrote:
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT can't be used when a reservation
> object lock is held or otherwise we can deadlock with page faults.
>
> Make lockdep complain badly about that.
>
> Signed-off-by: Christian König <christian.koenig at amd.com>
> ---
> drivers/gpu/drm/drm_syncobj.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index 6e74e6745eca..6228e9cd089a 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -387,6 +387,20 @@ int drm_syncobj_find_fence(struct drm_file *file_private,
> if (!syncobj)
> return -ENOENT;
>
> + if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT &&
> + IS_ENABLED(CONFIG_LOCKDEP)) {
> + struct dma_resv robj;
> +
> + /* Waiting for userspace with a reservation lock held is illegal
> + * cause that can deadlock with page faults. Make lockdep
> + * complain about it early on.
Not sure this is a good enough explanation, since anything that holds up
userspace can result in a functional deadlock (i.e. user observes no
forward progress, gets angry and decides that our gpu driver stack is
garbage). It's by far not just page faults.
I'd put something like
/* We must not impede forward progress of userspace in any
* way, for otherwise the future fence never materializes
* and the application grinds to a full halt. Check for
* the worst offenders in terms of locking issues.
*/
Feel free to bikeshed further.
> + */
> + dma_resv_init(&robj);
> + dma_resv_lock(&robj, NULL);
> + dma_resv_unlock(&robj);
> + dma_resv_fini(&robj);
I think you want to go stronger, since it's not just dma_resv, it's
holding anything that might hold up userspace that's illegal here. A
lockdep_assert_no_locks_held might be ideal, but a good second-best option
would be to grab mmap_lock. Since dma_resv (and a lot of other things,
like gup in general) nest within that it would be a substantially stronger
assertion.
Specifically this should also go boom when you do it in places like
serving (hmm) page faults, which I think we want. Just locking
dma_resv_lock won't go boom like that (since taking the dma_resv_lock from
a page fault handler is explicitly allowed, it nests within mmap_lock).
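
To illustrate why priming with the outer lock is the stronger check, here is a
minimal userspace sketch of a lock-ordering checker. It is a toy model, not
lockdep and not kernel code: the names lock_class, task_ctx, may_acquire and
acquire are hypothetical, and the only fact taken from the discussion above is
that dma_resv nests within mmap_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock classes, outermost first. Only the relative order of the two
 * classes comes from the discussion; everything else is an assumption. */
enum lock_class { MMAP_LOCK = 0, DMA_RESV = 1 };

#define MAX_HELD 8

/* Locks currently held by one (simulated) task. */
struct task_ctx {
	enum lock_class held[MAX_HELD];
	size_t depth;
};

/* Returns true if taking cls now would respect the ordering, i.e. every
 * lock already held is strictly outer to cls. Taking a lock of the same
 * class again, or an outer lock while holding an inner one, is reported
 * as a violation. */
static bool may_acquire(const struct task_ctx *ctx, enum lock_class cls)
{
	for (size_t i = 0; i < ctx->depth; i++)
		if (ctx->held[i] >= cls)
			return false;
	return true;
}

/* Records cls as held; does no ordering check of its own. */
static void acquire(struct task_ctx *ctx, enum lock_class cls)
{
	assert(ctx->depth < MAX_HELD);
	ctx->held[ctx->depth++] = cls;
}
```

In this model a simulated page-fault context (one that already holds MMAP_LOCK)
passes may_acquire(ctx, DMA_RESV) but fails may_acquire(ctx, MMAP_LOCK), while
a context holding DMA_RESV fails both. A check primed with the outer lock
therefore flags every context the inner-lock check flags, plus the page-fault
case.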
Conceptually I think it's otherwise all fine and at the right spot.
-Daniel
> + }
> +
> *fence = drm_syncobj_fence_get(syncobj);
> drm_syncobj_put(syncobj);
>
> --
> 2.25.1
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch