[Intel-gfx] [PATCH 2/2] drm/i915: Prevent recursive deadlock on releasing a busy userptr
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Tue Jul 29 11:39:55 CEST 2014
On 07/23/2014 06:15 PM, Chris Wilson wrote:
> On Wed, Jul 23, 2014 at 05:39:49PM +0100, Tvrtko Ursulin wrote:
>> On 07/21/2014 01:21 PM, Chris Wilson wrote:
>>> + mn = i915_mmu_notifier_get(obj->userptr.mm);
>>> + if (IS_ERR(mn))
>>> + return PTR_ERR(mn);
>>
>> Very minor, but I would perhaps consider renaming this to _find
>> since _get in my mind strongly associates with reference counting
>> and this does not do that. Especially if the reviewer looks at the
>> bail out below and sees no matching put. But minor as I said, you
>> can judge what you prefer.
>
> The same. It was _get because it used to be reference counted, but that
> counting has since been removed from the i915_mmu_notifier.
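
To make the naming point concrete, here is a toy userspace sketch. All of
the names in it (struct notifier, notifier_find, notifier_get, notifier_put)
are made up for illustration and are not the i915 functions. The idea is only
that a _get implies a matching _put, while a _find hands back a borrowed
pointer and leaves the count alone:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy type; not the i915_mmu_notifier. */
struct notifier {
    const char *key;
    int refcount;
};

static struct notifier table[] = {
    { "mm-a", 1 },
    { "mm-b", 1 },
};

/* _find: plain lookup, the caller borrows the pointer, no refcount change. */
static struct notifier *notifier_find(const char *key)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(table[i].key, key) == 0)
            return &table[i];
    return NULL;
}

/* _get: lookup plus a reference, so every success needs a notifier_put(). */
static struct notifier *notifier_get(const char *key)
{
    struct notifier *n = notifier_find(key);

    if (n)
        n->refcount++;
    return n;
}

/* _put: drops the reference taken by notifier_get(). */
static void notifier_put(struct notifier *n)
{
    assert(n->refcount > 0);
    n->refcount--;
}
```

A reader who sees notifier_get() on an error path will go looking for the
notifier_put(); with notifier_find() there is nothing to look for.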
>
>>> +static int
>>> +i915_gem_userptr_init__mm_struct(struct drm_i915_gem_object *obj)
>>> +{
>>> + struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
>>> + struct i915_mm_struct *mm;
>>> + struct mm_struct *real;
>>> + int ret = 0;
>>> +
>>> + real = get_task_mm(current);
>>> + if (real == NULL)
>>> + return -EINVAL;
>>
>> Do you think we need get_task_mm()/mmput() here, given it is all
>> inside a single system call?
>
> No. I kept using get_task_mm() simply because it looked neater than
> current->mm, but current->mm looks like it gives simpler code.
>
>>> + /* During release of the GEM object we hold the struct_mutex. As the
>>> + * object may be holding onto the last reference for the task->mm,
>>> + * calling mmput() may trigger exit_mmap() which closes the vma
>>> + * which will call drm_gem_vm_close() and attempt to reacquire
>>> + * the struct_mutex. In order to avoid that recursion, we have
>>> + * to defer the mmput() until after we drop the struct_mutex,
>>> + * i.e. we need to schedule a worker to do the clean up.
>>> + */
>>
>> This comment reads like a strange mixture of past and present, e.g.
>> what used to be the case and what is the fix. We don't hold a
>> reference to the process mm as the address space (terminology OK?).
>> We do hold a reference to the mm struct itself - which is enough to
>> unregister the notifiers, correct?
>
> True. I was more or less trying to explain the bug and that comment
> ended up being the changelog entry. It doesn't work well as a comment.
>
> + /* During release of the GEM object we hold the struct_mutex. This
> + * precludes us from calling mmput() at that time as that may be
> + * the last reference and so call exit_mmap(). exit_mmap() will
> + * attempt to reap the vma, and if we were holding a GTT mmap
> + * would then call drm_gem_vm_close() and attempt to reacquire
> + * the struct_mutex. So in order to avoid that recursion, we have
> + * to defer releasing the mm reference until after we drop the
> + * struct_mutex, i.e. we need to schedule a worker to do the clean
> + * up.
Sounds good, just saying really to remind you to post a respin. :)
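
For anyone following along, the deferral that comment describes can be
modelled roughly as below. This is a single-threaded userspace sketch under
my own assumptions, not the patch: big_lock stands in for struct_mutex,
mm_put() for mmput(), and the pending[] array for a workqueue. Every name
here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
static bool big_lock_held;       /* models a non-recursive struct_mutex */
static int  mm_refcount = 1;     /* models the last reference on the mm */
static int  teardown_runs;       /* counts completed final teardowns */
static bool deadlock_detected;   /* set if the lock is taken recursively */

static void big_lock(void)
{
    if (big_lock_held)
        deadlock_detected = true; /* a real mutex would hang here */
    big_lock_held = true;
}

static void big_unlock(void)
{
    big_lock_held = false;
}

/* Models the final mmput(): teardown closes the vma and retakes the lock. */
static void mm_put(void)
{
    if (--mm_refcount == 0) {
        big_lock();
        teardown_runs++;
        big_unlock();
    }
}

/* Minimal deferred-work list standing in for a worker. */
typedef void (*work_fn)(void);
static work_fn pending[4];
static size_t  npending;

static void schedule_work_fn(work_fn fn)
{
    assert(npending < sizeof(pending) / sizeof(pending[0]));
    pending[npending++] = fn;
}

static void run_pending_work(void)
{
    assert(!big_lock_held);       /* deferred work must run unlocked */
    for (size_t i = 0; i < npending; i++)
        pending[i]();
    npending = 0;
}

/* Release path: runs with the lock held, so it only queues the put. */
static void object_release_locked(void)
{
    assert(big_lock_held);
    schedule_work_fn(mm_put);
}

static void release_object(void)
{
    big_lock();
    object_release_locked();
    big_unlock();
    run_pending_work();           /* the final put happens after unlock */
}
```

Calling mm_put() directly from object_release_locked() would retake the lock
while it is still held, which is the recursion the comment warns about;
queueing it means the last reference is only dropped once the lock is free.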
Regards,
Tvrtko