[Intel-gfx] [PATCH v6 16/64] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5.

Thomas Hellström (Intel) thomas_os at shipmail.org
Mon Jan 18 12:55:30 UTC 2021


On 1/18/21 1:43 PM, Maarten Lankhorst wrote:
> On 18-01-2021 at 12:30, Thomas Hellström (Intel) wrote:
>> Hi,
>>
>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>> Instead of doing what we do currently, which will never work with
>>> PROVE_LOCKING, do the same as AMD does, and something similar to
>>> relocation slowpath. When all locks are dropped, we acquire the
>>> pages for pinning. When the locks are taken, we transfer those
>>> pages in .get_pages() to the bo. As a final check before installing
>>> the fences, we ensure that the mmu notifier was not called; if it was,
>>> we return -EAGAIN to userspace to signal that it has to start over.
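
For reference, the flow described above is essentially the mmu_interval_notifier
seqcount pattern. A minimal sketch of the submission side, with hypothetical
struct and helper names rather than the actual i915 code, could look like:

#include <linux/mm.h>
#include <linux/mmu_notifier.h>

/* Hypothetical per-object userptr state, for illustration only. */
struct userptr_state {
	struct mmu_interval_notifier notifier;	/* set up via mmu_interval_notifier_insert() */
	unsigned long userptr;			/* start of the userspace range */
	int npages;
	struct page **pvec;
};

static int userptr_submit(struct userptr_state *st)
{
	unsigned long seq;
	int pinned;

	/* With all ww locks dropped: sample the notifier seq, then pin. */
	seq = mmu_interval_read_begin(&st->notifier);
	pinned = pin_user_pages_fast(st->userptr, st->npages,
				     FOLL_WRITE, st->pvec);
	if (pinned < 0)
		return pinned;
	if (pinned < st->npages) {
		unpin_user_pages(st->pvec, pinned);
		return -EFAULT;
	}

	/* ... retake the locks, hand the pages to the object in .get_pages() ... */

	/*
	 * Final check before installing the fences: if the notifier fired
	 * since mmu_interval_read_begin(), the pages may be stale, so back
	 * off and return -EAGAIN so userspace restarts the ioctl.
	 */
	if (mmu_interval_read_retry(&st->notifier, seq))
		return -EAGAIN;

	return 0;
}

mmu_interval_read_retry() returning true is what maps to the -EAGAIN restart
described above.
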
>>>
>>> Changes since v1:
>>> - Unbinding is done in submit_init only. submit_begin() removed.
>>> - MMU_NOTFIER -> MMU_NOTIFIER
>>> Changes since v2:
>>> - Make i915->mm.notifier a spinlock.
>>> Changes since v3:
>>> - Add WARN_ON if there are any page references left, should have been 0.
>>> - Return 0 on success in submit_init(), bug from spinlock conversion.
>>> - Release pvec outside of notifier_lock (Thomas).
>>> Changes since v4:
>>> - Mention why we're clearing eb->[i + 1].vma in the code. (Thomas)
>>> - Actually check all invalidations in eb_move_to_gpu. (Thomas)
>>> - Do not wait when process is exiting to fix gem_ctx_persistence.userptr.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>>
>> ...
>>
>>>
>>> -static int
>>> -userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
>>> -                  const struct mmu_notifier_range *range)
>>> -{
>>> -    struct i915_mmu_notifier *mn =
>>> -        container_of(_mn, struct i915_mmu_notifier, mn);
>>> -    struct interval_tree_node *it;
>>> -    unsigned long end;
>>> -    int ret = 0;
>>> -
>>> -    if (RB_EMPTY_ROOT(&mn->objects.rb_root))
>>> -        return 0;
>>> -
>>> -    /* interval ranges are inclusive, but invalidate range is exclusive */
>>> -    end = range->end - 1;
>>> -
>>> -    spin_lock(&mn->lock);
>>> -    it = interval_tree_iter_first(&mn->objects, range->start, end);
>>> -    while (it) {
>>> -        struct drm_i915_gem_object *obj;
>>> -
>>> -        if (!mmu_notifier_range_blockable(range)) {
>>> -            ret = -EAGAIN;
>>> -            break;
>>> -        }
>>> +    spin_lock(&i915->mm.notifier_lock);
>>>
>>> -        /*
>>> -         * The mmu_object is released late when destroying the
>>> -         * GEM object so it is entirely possible to gain a
>>> -         * reference on an object in the process of being freed
>>> -         * since our serialisation is via the spinlock and not
>>> -         * the struct_mutex - and consequently use it after it
>>> -         * is freed and then double free it. To prevent that
>>> -         * use-after-free we only acquire a reference on the
>>> -         * object if it is not in the process of being destroyed.
>>> -         */
>>> -        obj = container_of(it, struct i915_mmu_object, it)->obj;
>>> -        if (!kref_get_unless_zero(&obj->base.refcount)) {
>>> -            it = interval_tree_iter_next(it, range->start, end);
>>> -            continue;
>>> -        }
>>> -        spin_unlock(&mn->lock);
>>> +    mmu_interval_set_seq(mni, cur_seq);
>>>
>>> -        ret = i915_gem_object_unbind(obj,
>>> -                         I915_GEM_OBJECT_UNBIND_ACTIVE |
>>> -                         I915_GEM_OBJECT_UNBIND_BARRIER);
>>> -        if (ret == 0)
>>> -            ret = __i915_gem_object_put_pages(obj);
>>> -        i915_gem_object_put(obj);
>>> -        if (ret)
>>> -            return ret;
>>> +    spin_unlock(&i915->mm.notifier_lock);
>>>
>>> -        spin_lock(&mn->lock);
>>> +    /* During exit there's no need to wait */
>>> +    if (current->flags & PF_EXITING)
>>> +        return true;
>> Did we ever find out why this is needed, that is, why the old userptr invalidation call doesn't hang here in a similar way?
> It's an optimization for teardown because userptr will be invalidated anyway, but also for gem_ctx_persistence.userptr, although
> with ulls that test may stop working anyway because it takes an out_fence.

Sure, but what I meant was: did we find out what's different in the new 
code compared to the old one? The old code also waits for the GPU when 
unbinding in the mmu_notifier, yet it appears that in the old code the 
mmu notifier is never called here. At least to me it seems it would be 
good if we understood what that difference is.
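
For context, and to make the question concrete: the new invalidate side boils
down to roughly the shape below. This is a simplified sketch reusing the
hypothetical userptr_state from the sketch further up, with the locking and
the wait step only indicated in comments; it is not the actual patch.

static bool userptr_invalidate(struct mmu_interval_notifier *mni,
			       const struct mmu_notifier_range *range,
			       unsigned long cur_seq)
{
	struct userptr_state *st =
		container_of(mni, struct userptr_state, notifier);

	/* Non-blockable invalidations cannot wait for the GPU; ask the core to retry. */
	if (!mmu_notifier_range_blockable(range))
		return false;

	/*
	 * Bump the seq so a concurrent submit fails mmu_interval_read_retry()
	 * and restarts; the patch takes a driver-wide spinlock
	 * (i915->mm.notifier_lock) around this.
	 */
	mmu_interval_set_seq(mni, cur_seq);

	/* During exit there's no need to wait; the mm is going away anyway. */
	if (current->flags & PF_EXITING)
		return true;

	/*
	 * Otherwise wait for outstanding GPU work that may still be using the
	 * pinned pages (e.g. on the object's dma_resv fences) before the
	 * range is torn down.
	 */
	wait_for_object_idle(st);	/* hypothetical helper */

	return true;
}

The wait in that last step is the part the PF_EXITING check short-circuits.
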

/Thomas



