[Intel-gfx] [PATCH v4 15/61] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v4.

Thomas Hellström (Intel) thomas_os at shipmail.org
Mon Oct 19 07:52:51 UTC 2020


On 10/19/20 9:30 AM, Thomas Hellström (Intel) wrote:
>
> On 10/16/20 12:43 PM, Maarten Lankhorst wrote:
>> Instead of doing what we do currently, which will never work with
>> PROVE_LOCKING, do the same as AMD does, and something similar to
>> relocation slowpath. When all locks are dropped, we acquire the
>> pages for pinning. When the locks are taken, we transfer those
>> pages in .get_pages() to the bo. As a final check before installing
>> the fences, we ensure that the mmu notifier was not called; if it was,
>> we return -EAGAIN to userspace to signal that it has to start over.
>>
>> Changes since v1:
>> - Unbinding is done in submit_init only. submit_begin() removed.
>> - MMU_NOTFIER -> MMU_NOTIFIER
>> Changes since v2:
>> - Make i915->mm.notifier a spinlock.
>> Changes since v3:
>> - Add WARN_ON if there are any page references left, should have been 0.
>> - Return 0 on success in submit_init(), bug from spinlock conversion.
>> - Release pvec outside of notifier_lock (Thomas).
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>
...
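
For reference, the flow described in the commit message splits into a
submit_init step that snapshots the notifier sequence and pins the pages
with all locks dropped, and a submit_done step that re-checks the
sequence right before the fences are installed. A minimal sketch,
assuming helper and field names (userptr_submit_init/_done,
userptr.notifier_seq) that are illustrative rather than taken verbatim
from the patch:

static int userptr_submit_init(struct drm_i915_gem_object *obj)
{
	/* All ww locks are dropped here, so pinning may fault and sleep. */
	obj->userptr.notifier_seq =
		mmu_interval_read_begin(&obj->userptr.notifier);

	/* ... pin_user_pages_fast() into a pvec; .get_pages() adopts the
	 * pinned pages later, once the locks are held again ... */
	return 0;
}

static int userptr_submit_done(struct drm_i915_gem_object *obj)
{
	/* Final check before installing fences: if the interval notifier
	 * fired since submit_init(), the pages may be stale. */
	if (mmu_interval_read_retry(&obj->userptr.notifier,
				    obj->userptr.notifier_seq))
		return -EAGAIN;	/* userspace restarts the ioctl */

	return 0;
}
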
>> +static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
>> +                    const struct mmu_notifier_range *range,
>> +                    unsigned long cur_seq)
>>   {
>> -    struct i915_mmu_notifier *mn =
>> -        container_of(_mn, struct i915_mmu_notifier, mn);
>> -    struct interval_tree_node *it;
>> -    unsigned long end;
>> -    int ret = 0;
>> -
>> -    if (RB_EMPTY_ROOT(&mn->objects.rb_root))
>> -        return 0;
>> -
>> -    /* interval ranges are inclusive, but invalidate range is exclusive */
>> -    end = range->end - 1;
>> -
>> -    spin_lock(&mn->lock);
>> -    it = interval_tree_iter_first(&mn->objects, range->start, end);
>> -    while (it) {
>> -        struct drm_i915_gem_object *obj;
>> -
>> -        if (!mmu_notifier_range_blockable(range)) {
>> -            ret = -EAGAIN;
>> -            break;
>> -        }
>> +    struct drm_i915_gem_object *obj = container_of(mni, struct drm_i915_gem_object, userptr.notifier);
>> +    struct drm_i915_private *i915 = to_i915(obj->base.dev);
>> +    long r;
>>
>> -        /*
>> -         * The mmu_object is released late when destroying the
>> -         * GEM object so it is entirely possible to gain a
>> -         * reference on an object in the process of being freed
>> -         * since our serialisation is via the spinlock and not
>> -         * the struct_mutex - and consequently use it after it
>> -         * is freed and then double free it. To prevent that
>> -         * use-after-free we only acquire a reference on the
>> -         * object if it is not in the process of being destroyed.
>> -         */
>> -        obj = container_of(it, struct i915_mmu_object, it)->obj;
>> -        if (!kref_get_unless_zero(&obj->base.refcount)) {
>> -            it = interval_tree_iter_next(it, range->start, end);
>> -            continue;
>> -        }
>> -        spin_unlock(&mn->lock);
>> +    if (!mmu_notifier_range_blockable(range))
>> +        return false;
>>
>> -        ret = i915_gem_object_unbind(obj,
>> -                         I915_GEM_OBJECT_UNBIND_ACTIVE |
>> -                         I915_GEM_OBJECT_UNBIND_BARRIER);
>> -        if (ret == 0)
>> -            ret = __i915_gem_object_put_pages(obj);
>> -        i915_gem_object_put(obj);
>> -        if (ret)
>> -            return ret;
>
> If we add an additional fence wait here, before setting the seq (which
> takes the invalidation write seqlock), we'd reduce (but definitely not
> eliminate) the chance of waiting inside the invalidation write seqlock,
> which could stall submit_init until the write_seqlock is released. That
> would make the new userptr invalidation about as unlikely as the
> previous userptr invalidation to trigger a wait in the command
> submission path.
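
Roughly, assuming the rest of the new callback bumps the notifier
sequence under the new notifier spinlock and then waits for the object's
fences (the lock/field names and the exact wait call below are my
guesses rather than verbatim from the patch), the suggested reordering
would look like this:

	/* Suggested extra wait: drain the fences *before* bumping the
	 * sequence, so a concurrent submit is less likely to stall on
	 * the in-progress invalidation. */
	dma_resv_wait_timeout_rcu(obj->base.resv, true, false,
				  MAX_SCHEDULE_TIMEOUT);

	spin_lock(&i915->mm.notifier_lock);
	mmu_interval_set_seq(mni, cur_seq);
	spin_unlock(&i915->mm.notifier_lock);

	/* Wait the callback already does: make sure the GPU is finished
	 * with the pages being invalidated before they are unpinned. */
	r = dma_resv_wait_timeout_rcu(obj->base.resv, true, false,
				      MAX_SCHEDULE_TIMEOUT);

	return true;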

Hmm. Scratch that; it would only postpone a wait during command 
submission to the next submission, since we'd be more likely to install 
yet another fence.

/Thomas



