[Intel-gfx] [PATCH v6 16/64] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5.
Thomas Hellström (Intel)
thomas_os at shipmail.org
Wed Jan 20 13:32:42 UTC 2021
On 1/18/21 1:55 PM, Thomas Hellström (Intel) wrote:
>
> On 1/18/21 1:43 PM, Maarten Lankhorst wrote:
>> Op 18-01-2021 om 12:30 schreef Thomas Hellström (Intel):
>>> Hi,
>>>
>>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>>> Instead of doing what we do currently, which will never work with
>>>> PROVE_LOCKING, do the same as AMD does, and something similar to
>>>> relocation slowpath. When all locks are dropped, we acquire the
>>>> pages for pinning. When the locks are taken, we transfer those
>>>> pages in .get_pages() to the bo. As a final check before installing
>>>> the fences, we ensure that the mmu notifier was not called; if it
>>>> was, we return -EAGAIN to userspace to signal it has to start over.
>>>>
>>>> Changes since v1:
>>>> - Unbinding is done in submit_init only. submit_begin() removed.
>>>> - MMU_NOTFIER -> MMU_NOTIFIER
>>>> Changes since v2:
>>>> - Make i915->mm.notifier a spinlock.
>>>> Changes since v3:
>>>> - Add WARN_ON if there are any page references left, should have
>>>> been 0.
>>>> - Return 0 on success in submit_init(), bug from spinlock conversion.
>>>> - Release pvec outside of notifier_lock (Thomas).
>>>> Changes since v4:
>>>> - Mention why we're clearing eb->[i + 1].vma in the code. (Thomas)
>>>> - Actually check all invalidations in eb_move_to_gpu. (Thomas)
>>>> - Do not wait when process is exiting to fix
>>>> gem_ctx_persistence.userptr.
>>>>
>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>>>
>>> ...
>>>
>>>> -static int
>>>> -userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
>>>> - const struct mmu_notifier_range *range)
>>>> -{
>>>> - struct i915_mmu_notifier *mn =
>>>> - container_of(_mn, struct i915_mmu_notifier, mn);
>>>> - struct interval_tree_node *it;
>>>> - unsigned long end;
>>>> - int ret = 0;
>>>> -
>>>> - if (RB_EMPTY_ROOT(&mn->objects.rb_root))
>>>> - return 0;
>>>> -
>>>> - /* interval ranges are inclusive, but invalidate range is exclusive */
>>>> - end = range->end - 1;
>>>> -
>>>> - spin_lock(&mn->lock);
>>>> - it = interval_tree_iter_first(&mn->objects, range->start, end);
>>>> - while (it) {
>>>> - struct drm_i915_gem_object *obj;
>>>> -
>>>> - if (!mmu_notifier_range_blockable(range)) {
>>>> - ret = -EAGAIN;
>>>> - break;
>>>> - }
>>>> + spin_lock(&i915->mm.notifier_lock);
>>>> - /*
>>>> - * The mmu_object is released late when destroying the
>>>> - * GEM object so it is entirely possible to gain a
>>>> - * reference on an object in the process of being freed
>>>> - * since our serialisation is via the spinlock and not
>>>> - * the struct_mutex - and consequently use it after it
>>>> - * is freed and then double free it. To prevent that
>>>> - * use-after-free we only acquire a reference on the
>>>> - * object if it is not in the process of being destroyed.
>>>> - */
>>>> - obj = container_of(it, struct i915_mmu_object, it)->obj;
>>>> - if (!kref_get_unless_zero(&obj->base.refcount)) {
>>>> - it = interval_tree_iter_next(it, range->start, end);
>>>> - continue;
>>>> - }
>>>> - spin_unlock(&mn->lock);
>>>> + mmu_interval_set_seq(mni, cur_seq);
>>>> - ret = i915_gem_object_unbind(obj,
>>>> - I915_GEM_OBJECT_UNBIND_ACTIVE |
>>>> - I915_GEM_OBJECT_UNBIND_BARRIER);
>>>> - if (ret == 0)
>>>> - ret = __i915_gem_object_put_pages(obj);
>>>> - i915_gem_object_put(obj);
>>>> - if (ret)
>>>> - return ret;
>>>> + spin_unlock(&i915->mm.notifier_lock);
>>>> - spin_lock(&mn->lock);
>>>> + /* During exit there's no need to wait */
>>>> + if (current->flags & PF_EXITING)
>>>> + return true;
>>> Did we ever find out why this is needed, that is, why the old userptr
>>> invalidation callback doesn't hang here in a similar way?
>> It's an optimization for teardown, because userptr will be invalidated
>> anyway, but it's also needed for gem_ctx_persistence.userptr, although
>> with ULLS that test may stop working anyway because it takes an
>> out_fence.
>
> Sure, but what I meant was: Did we find out what's different in the
> new code compared to the old one? The old code also waits for the
> GPU when unbinding in the mmu notifier, yet in the old code the mmu
> notifier is apparently never called here. At least to me it seems
> it would be good if we understood what that difference is.
>
> /Thomas
>
IIRC I did some investigation here as well and from what I could tell,
the notifier was not called at all for the old code. I don't really feel
comfortable with an R-B until we've really understood why.
Also, there was a discussion with you, Sudeep and Chris about whether
user-space could actually cope with this, with you saying it worked and
Chris saying tests were showing otherwise. What were those tests?
Thanks,
Thomas