[Intel-gfx] [PATCH v4 15/61] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v4.
Thomas Hellström (Intel)
thomas_os at shipmail.org
Mon Oct 19 07:52:51 UTC 2020
On 10/19/20 9:30 AM, Thomas Hellström (Intel) wrote:
>
> On 10/16/20 12:43 PM, Maarten Lankhorst wrote:
>> Instead of doing what we do currently, which will never work with
>> PROVE_LOCKING, do the same as AMD does, and something similar to
>> relocation slowpath. When all locks are dropped, we acquire the
>> pages for pinning. When the locks are taken, we transfer those
>> pages in .get_pages() to the bo. As a final check before installing
>> the fences, we ensure that the mmu notifier was not called; if it was,
>> we return -EAGAIN to userspace to signal that it has to start over.
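For readers less familiar with the mmu_interval_notifier helpers, the flow
described above boils down to roughly the sketch below. All names
(my_userptr, my_acquire_pages(), my_check_notifier()) are made up for
illustration; this is not the patch's code:

#include <linux/mm.h>
#include <linux/mmu_notifier.h>

/* Illustrative only; not the layout used by i915. */
struct my_userptr {
	struct mmu_interval_notifier notifier;
	unsigned long notifier_seq;	/* seq sampled before pinning */
	unsigned long addr;		/* userspace address of the range */
	unsigned long num_pages;
	int page_ref;			/* refcount on the pinned pvec */
	struct page **pvec;
};

/* Step 1: with all locks dropped, sample the seq and pin the pages. */
static int my_acquire_pages(struct my_userptr *p)
{
	int pinned;

	p->notifier_seq = mmu_interval_read_begin(&p->notifier);
	pinned = pin_user_pages_fast(p->addr, p->num_pages, FOLL_WRITE,
				     p->pvec);
	if (pinned < 0)
		return pinned;
	if (pinned < p->num_pages) {
		unpin_user_pages(p->pvec, pinned);
		return -EFAULT;
	}
	return 0;
}

/*
 * Step 2 (not shown): with the locks held again, .get_pages() hands the
 * pinned pages over to the object.
 *
 * Step 3: just before installing the fences, check that the notifier did
 * not fire in the meantime; if it did, the ioctl restarts with -EAGAIN.
 */
static int my_check_notifier(struct my_userptr *p)
{
	if (mmu_interval_read_retry(&p->notifier, p->notifier_seq))
		return -EAGAIN;
	return 0;
}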
>>
>> Changes since v1:
>> - Unbinding is done in submit_init only. submit_begin() removed.
>> - MMU_NOTFIER -> MMU_NOTIFIER
>> Changes since v2:
>> - Make i915->mm.notifier a spinlock.
>> Changes since v3:
>> - Add WARN_ON if there are any page references left, should have been 0.
>> - Return 0 on success in submit_init(), bug from spinlock conversion.
>> - Release pvec outside of notifier_lock (Thomas).
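That last point (releasing the pvec outside the notifier lock) is the
familiar detach-under-the-lock, release-after-unlocking pattern; roughly,
reusing the made-up my_userptr from the sketch above (illustrative only,
not the patch's code):

#include <linux/spinlock.h>

/* Drop the last reference under the spinlock, unpin and free outside it. */
static void my_userptr_put_pages(struct my_userptr *p,
				 spinlock_t *notifier_lock)
{
	struct page **pvec = NULL;

	spin_lock(notifier_lock);
	if (!--p->page_ref) {
		pvec = p->pvec;
		p->pvec = NULL;
	}
	WARN_ON(p->page_ref < 0);	/* sanity: never go negative */
	spin_unlock(notifier_lock);

	if (pvec) {
		unpin_user_pages(pvec, p->num_pages);
		kvfree(pvec);
	}
}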
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>
...
>> +static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
>> +					const struct mmu_notifier_range *range,
>> +					unsigned long cur_seq)
>>  {
>> -	struct i915_mmu_notifier *mn =
>> -		container_of(_mn, struct i915_mmu_notifier, mn);
>> -	struct interval_tree_node *it;
>> -	unsigned long end;
>> -	int ret = 0;
>> -
>> -	if (RB_EMPTY_ROOT(&mn->objects.rb_root))
>> -		return 0;
>> -
>> -	/* interval ranges are inclusive, but invalidate range is exclusive */
>> -	end = range->end - 1;
>> -
>> -	spin_lock(&mn->lock);
>> -	it = interval_tree_iter_first(&mn->objects, range->start, end);
>> -	while (it) {
>> -		struct drm_i915_gem_object *obj;
>> -
>> -		if (!mmu_notifier_range_blockable(range)) {
>> -			ret = -EAGAIN;
>> -			break;
>> -		}
>> +	struct drm_i915_gem_object *obj = container_of(mni, struct drm_i915_gem_object, userptr.notifier);
>> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>> +	long r;
>> -		/*
>> -		 * The mmu_object is released late when destroying the
>> -		 * GEM object so it is entirely possible to gain a
>> -		 * reference on an object in the process of being freed
>> -		 * since our serialisation is via the spinlock and not
>> -		 * the struct_mutex - and consequently use it after it
>> -		 * is freed and then double free it. To prevent that
>> -		 * use-after-free we only acquire a reference on the
>> -		 * object if it is not in the process of being destroyed.
>> -		 */
>> -		obj = container_of(it, struct i915_mmu_object, it)->obj;
>> -		if (!kref_get_unless_zero(&obj->base.refcount)) {
>> -			it = interval_tree_iter_next(it, range->start, end);
>> -			continue;
>> -		}
>> -		spin_unlock(&mn->lock);
>> +	if (!mmu_notifier_range_blockable(range))
>> +		return false;
>> -		ret = i915_gem_object_unbind(obj,
>> -					     I915_GEM_OBJECT_UNBIND_ACTIVE |
>> -					     I915_GEM_OBJECT_UNBIND_BARRIER);
>> -		if (ret == 0)
>> -			ret = __i915_gem_object_put_pages(obj);
>> -		i915_gem_object_put(obj);
>> -		if (ret)
>> -			return ret;
>
> If we add an additional fence wait here before setting the seq (which
> takes the invalidation write seqlock), we'd reduce (but definitely not
> eliminate) the chance of waiting inside the invalidation write
> seqlock, which could trigger a wait in submit_init until the
> write_seqlock is released. That would make the new userptr
> invalidation about as unlikely to trigger a wait in the command
> submission path as the previous userptr invalidation was.
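Concretely, the suggestion above amounts to something like the rough
sketch below, with made-up names building on the earlier my_userptr
sketch (which is assumed to also carry a struct dma_resv *resv pointer);
this is not the actual patch code:

#include <linux/dma-resv.h>
#include <linux/mmu_notifier.h>
#include <linux/sched.h>	/* MAX_SCHEDULE_TIMEOUT */

static bool my_userptr_invalidate(struct mmu_interval_notifier *mni,
				  const struct mmu_notifier_range *range,
				  unsigned long cur_seq)
{
	struct my_userptr *p = container_of(mni, struct my_userptr, notifier);
	long r;

	if (!mmu_notifier_range_blockable(range))
		return false;

	/*
	 * Opportunistic wait for already installed fences, before the
	 * seq update takes the invalidation write side (the suggestion).
	 */
	dma_resv_wait_timeout_rcu(p->resv, true, false,
				  MAX_SCHEDULE_TIMEOUT);

	mmu_interval_set_seq(mni, cur_seq);

	/* The wait the comment refers to: fences may have been added meanwhile. */
	r = dma_resv_wait_timeout_rcu(p->resv, true, false,
				      MAX_SCHEDULE_TIMEOUT);
	WARN_ON(r <= 0);

	return true;
}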
Hmm. Scratch that; it would only postpone a wait during command
submission to the next submission, since we'd then be more likely to
install yet another fence.
/Thomas