[Intel-gfx] [PATCH v6 16/64] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5.

Maarten Lankhorst maarten.lankhorst at linux.intel.com
Mon Jan 18 12:43:03 UTC 2021


On 18-01-2021 at 12:30, Thomas Hellström (Intel) wrote:
> Hi,
>
> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>> Instead of doing what we do currently, which will never work with
>> PROVE_LOCKING, do the same as AMD does, and something similar to
>> relocation slowpath. When all locks are dropped, we acquire the
>> pages for pinning. When the locks are taken, we transfer those
>> pages in .get_pages() to the bo. As a final check before installing
>> the fences, we ensure that the mmu notifier was not called; if it is,
>> we return -EAGAIN to userspace to signal it has to start over.
>>
>> Changes since v1:
>> - Unbinding is done in submit_init only. submit_begin() removed.
>> - MMU_NOTFIER -> MMU_NOTIFIER
>> Changes since v2:
>> - Make i915->mm.notifier a spinlock.
>> Changes since v3:
>> - Add WARN_ON if there are any page references left, should have been 0.
>> - Return 0 on success in submit_init(), bug from spinlock conversion.
>> - Release pvec outside of notifier_lock (Thomas).
>> Changes since v4:
>> - Mention why we're clearing eb->[i + 1].vma in the code. (Thomas)
>> - Actually check all invalidations in eb_move_to_gpu. (Thomas)
>> - Do not wait when process is exiting to fix gem_ctx_persistence.userptr.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>
>
> ...
>
>>
>> -static int
>> -userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
>> -                  const struct mmu_notifier_range *range)
>> -{
>> -    struct i915_mmu_notifier *mn =
>> -        container_of(_mn, struct i915_mmu_notifier, mn);
>> -    struct interval_tree_node *it;
>> -    unsigned long end;
>> -    int ret = 0;
>> -
>> -    if (RB_EMPTY_ROOT(&mn->objects.rb_root))
>> -        return 0;
>> -
>> -    /* interval ranges are inclusive, but invalidate range is exclusive */
>> -    end = range->end - 1;
>> -
>> -    spin_lock(&mn->lock);
>> -    it = interval_tree_iter_first(&mn->objects, range->start, end);
>> -    while (it) {
>> -        struct drm_i915_gem_object *obj;
>> -
>> -        if (!mmu_notifier_range_blockable(range)) {
>> -            ret = -EAGAIN;
>> -            break;
>> -        }
>> +    spin_lock(&i915->mm.notifier_lock);
>>
>> -        /*
>> -         * The mmu_object is released late when destroying the
>> -         * GEM object so it is entirely possible to gain a
>> -         * reference on an object in the process of being freed
>> -         * since our serialisation is via the spinlock and not
>> -         * the struct_mutex - and consequently use it after it
>> -         * is freed and then double free it. To prevent that
>> -         * use-after-free we only acquire a reference on the
>> -         * object if it is not in the process of being destroyed.
>> -         */
>> -        obj = container_of(it, struct i915_mmu_object, it)->obj;
>> -        if (!kref_get_unless_zero(&obj->base.refcount)) {
>> -            it = interval_tree_iter_next(it, range->start, end);
>> -            continue;
>> -        }
>> -        spin_unlock(&mn->lock);
>> +    mmu_interval_set_seq(mni, cur_seq);
>>
>> -        ret = i915_gem_object_unbind(obj,
>> -                         I915_GEM_OBJECT_UNBIND_ACTIVE |
>> -                         I915_GEM_OBJECT_UNBIND_BARRIER);
>> -        if (ret == 0)
>> -            ret = __i915_gem_object_put_pages(obj);
>> -        i915_gem_object_put(obj);
>> -        if (ret)
>> -            return ret;
>> +    spin_unlock(&i915->mm.notifier_lock);
>>
>> -        spin_lock(&mn->lock);
>> +    /* During exit there's no need to wait */
>> +    if (current->flags & PF_EXITING)
>> +        return true;
>
> Did we ever find out why this is needed, that is, why the old userptr invalidation callback doesn't hang here in a similar way?

It's an optimization for teardown, because the userptr will be invalidated anyway, but it is also needed for gem_ctx_persistence.userptr, although with ULLS that test may stop working anyway because it takes an out_fence.
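
For reference, the pattern being discussed follows the standard mmu_interval_notifier scheme. A minimal sketch only; my_userptr, my_userptr_invalidate, my_submit_check and notifier_lock are illustrative names, not the exact i915 code:

/* Sketch only: illustrative driver-side structures, not the exact patch. */
#include <linux/mmu_notifier.h>
#include <linux/sched.h>
#include <linux/spinlock.h>

struct my_userptr {
	struct mmu_interval_notifier notifier;
	spinlock_t *notifier_lock;	/* the per-device notifier lock */
};

static bool my_userptr_invalidate(struct mmu_interval_notifier *mni,
				  const struct mmu_notifier_range *range,
				  unsigned long cur_seq)
{
	struct my_userptr *p = container_of(mni, struct my_userptr, notifier);

	if (!mmu_notifier_range_blockable(range))
		return false;

	/* Bump the sequence so a concurrent submission sees the invalidation. */
	spin_lock(p->notifier_lock);
	mmu_interval_set_seq(mni, cur_seq);
	spin_unlock(p->notifier_lock);

	/* During exit there's no need to wait for the GPU. */
	if (current->flags & PF_EXITING)
		return true;

	/* ... otherwise wait for outstanding GPU activity on the object ... */
	return true;
}

/* Submission side: pages are acquired with all locks dropped, and the
 * sequence is re-checked before the fences are installed. */
static int my_submit_check(struct my_userptr *p)
{
	unsigned long seq = mmu_interval_read_begin(&p->notifier);

	/* ... acquire the pages and build the request with all locks dropped ... */

	spin_lock(p->notifier_lock);
	if (mmu_interval_read_retry(&p->notifier, seq)) {
		spin_unlock(p->notifier_lock);
		return -EAGAIN;	/* userspace has to start over */
	}
	/* ... install fences ... */
	spin_unlock(p->notifier_lock);

	return 0;
}

That matches what the commit message describes: the pages are acquired while all locks are dropped, and -EAGAIN is returned to userspace if the notifier fired in between.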


