[Intel-gfx] [PATCH] drm/i915: Prevent recursive deadlock on releasing a busy userptr

Jani Nikula jani.nikula at linux.intel.com
Tue Sep 2 14:48:42 CEST 2014


On Tue, 02 Sep 2014, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> On Tue, Sep 02, 2014 at 11:35:00AM +0200, Daniel Vetter wrote:
>> On Mon, Sep 01, 2014 at 02:23:17PM +0100, Tvrtko Ursulin wrote:
>> > 
>> > On 08/07/2014 02:20 PM, Chris Wilson wrote:
>> > >During release of the GEM object we hold the struct_mutex. As the
>> > >object may be holding onto the last reference for the task->mm,
>> > >calling mmput() may trigger exit_mmap() which close the vma
>> > >which will call drm_gem_vm_close() and attempt to reacquire
>> > >the struct_mutex. In order to avoid that recursion, we have
>> > >to defer the mmput() until after we drop the struct_mutex,
>> > >i.e. we need to schedule a worker to do the clean up. A further issue
>> > >spotted by Tvrtko was caused when we took a GTT mmapping of a userptr
>> > >buffer object. In that case, we would never call mmput as the object
>> > >would be cyclically referenced by the GTT mmapping and not freed upon
>> > >process exit - keeping the entire process mm alive after the process
>> > >task was reaped. The fix employed is to replace the mm_users/mmput()
>> > >reference handling to mm_count/mmdrop() for the shared i915_mm_struct.
>> > >
>> > >    INFO: task test_surfaces:1632 blocked for more than 120 seconds.
>> > >          Tainted: GF          O 3.14.5+ #1
>> > >    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> > >    test_surfaces   D 0000000000000000     0  1632   1590 0x00000082
>> > >     ffff88014914baa8 0000000000000046 0000000000000000 ffff88014914a010
>> > >     0000000000012c40 0000000000012c40 ffff8800a0058210 ffff88014784b010
>> > >     ffff88014914a010 ffff880037b1c820 ffff8800a0058210 ffff880037b1c824
>> > >    Call Trace:
>> > >     [<ffffffff81582499>] schedule+0x29/0x70
>> > >     [<ffffffff815825fe>] schedule_preempt_disabled+0xe/0x10
>> > >     [<ffffffff81583b93>] __mutex_lock_slowpath+0x183/0x220
>> > >     [<ffffffff81583c53>] mutex_lock+0x23/0x40
>> > >     [<ffffffffa005c2a3>] drm_gem_vm_close+0x33/0x70 [drm]
>> > >     [<ffffffff8115a483>] remove_vma+0x33/0x70
>> > >     [<ffffffff8115a5dc>] exit_mmap+0x11c/0x170
>> > >     [<ffffffff8104d6eb>] mmput+0x6b/0x100
>> > >     [<ffffffffa00f44b9>] i915_gem_userptr_release+0x89/0xc0 [i915]
>> > >     [<ffffffffa00e6706>] i915_gem_free_object+0x126/0x250 [i915]
>> > >     [<ffffffffa005c06a>] drm_gem_object_free+0x2a/0x40 [drm]
>> > >     [<ffffffffa005cc32>] drm_gem_object_handle_unreference_unlocked+0xe2/0x120 [drm]
>> > >     [<ffffffffa005ccd4>] drm_gem_object_release_handle+0x64/0x90 [drm]
>> > >     [<ffffffff8127ffeb>] idr_for_each+0xab/0x100
>> > >     [<ffffffffa005cc70>] ?  drm_gem_object_handle_unreference_unlocked+0x120/0x120 [drm]
>> > >     [<ffffffff81583c46>] ? mutex_lock+0x16/0x40
>> > >     [<ffffffffa005c354>] drm_gem_release+0x24/0x40 [drm]
>> > >     [<ffffffffa005b82b>] drm_release+0x3fb/0x480 [drm]
>> > >     [<ffffffff8118d482>] __fput+0xb2/0x260
>> > >     [<ffffffff8118d6de>] ____fput+0xe/0x10
>> > >     [<ffffffff8106f27f>] task_work_run+0x8f/0xf0
>> > >     [<ffffffff81052228>] do_exit+0x1a8/0x480
>> > >     [<ffffffff81052551>] do_group_exit+0x51/0xc0
>> > >     [<ffffffff810525d7>] SyS_exit_group+0x17/0x20
>> > >     [<ffffffff8158e092>] system_call_fastpath+0x16/0x1b
>> > >
>> > >v2: Incorporate feedback from Tvrtko and remove the unnessary mm
>> > >referencing when creating the i915_mm_struct and improve some of the
>> > >function names and comments.
>> > 
>> > [snip]
>> > 
>> > I reviewed this some weeks back and did not spot any issues.
>> > 
>> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Queued for -next, thanks for the patch. Aside: Is there other userptr
>> stuff outstanding? I've definitely lost track of them :(
>
> The one I posted with the extra r-b and t-b has a minor tweak to kill
> the unused error code during init. but otherwise this is the last
> outstanding patch.

This was referenced from [1], do we need a stable backport?

BR,
Jani.


[1] https://bugs.freedesktop.org/show_bug.cgi?id=80745


> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Technology Center



More information about the Intel-gfx mailing list