[Intel-gfx] [PATCH 13/21] drm/i915: Pull i915_vma_pin under the vm->mutex
Chris Wilson
chris at chris-wilson.co.uk
Mon Sep 16 11:10:17 UTC 2019
Quoting Tvrtko Ursulin (2019-09-16 11:13:23)
>
> On 02/09/2019 05:02, Chris Wilson wrote:
> > Replace the struct_mutex requirement for pinning the i915_vma with the
> > local vm->mutex instead. Note that the vm->mutex is tainted by the
> > shrinker (we require unbinding from inside fs-reclaim) and so we cannot
> > allocate while holding that mutex. Instead we have to preallocate
> > workers to do the allocation and apply the PTE updates after we have
> > reserved their slot in the drm_mm (using fences to order the PTE writes
> > with the GPU work and with later unbind).
>
> Can you put some paragraphs into the commit describing the
> infrastructure changes? Like changes to active tracker at least.
You mean the addition of the preallocated node?
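Roughly, the flow is (all helper names below are made up purely for
illustration, they are not the API in the patch):

static int async_bind(struct i915_vma *vma, struct i915_bind_worker *w)
{
	int err;

	/* 1. Anything that may allocate (and so recurse into fs-reclaim,
	 *    and from there back into vm->mutex via the shrinker) is done
	 *    up front, outside vm->mutex.
	 */
	err = preallocate_ptes(vma, w);
	if (err)
		return err;

	/* 2. Under vm->mutex we only reserve the slot in the drm_mm. */
	mutex_lock(&vma->vm->mutex);
	err = reserve_node(vma);
	mutex_unlock(&vma->vm->mutex);
	if (err)
		return err;

	/* 3. The worker's fence orders the actual PTE writes against GPU
	 *    use of the vma and against any later unbind.
	 */
	queue_bind_worker(w);
	return 0;
}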
> Then commentary on effects on shrinker, fences, object management,
> execbuf, basically all major parts of the code which have now been
> fundamentally improved or changed. It would help with review. It's a
> chunky patch/change and I think it needs some theory of operation text.
Off the top of my head, they are all just lock switches. The funky stuff
lies in avoiding the introduction of a kref for i915_vma, which requires a
whole bunch of trylocks to avoid double-frees. (I did plan on having a
kref here, but that leads to rewriting object page tracking which is in
its own quagmire.)
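As an illustration of the trylock dance (approximate, not the exact code
in the patch):

static void object_close_vma(struct i915_vma *vma)
{
	/* Without a kref, object-close and vm-close race to free the
	 * same vma; whichever side cannot take the other's lock backs
	 * off and leaves the free to the lock owner (or to a deferred
	 * pass made under that lock).
	 */
	if (!mutex_trylock(&vma->vm->mutex))
		return;

	release_vma(vma); /* stand-in for the low-level teardown */
	mutex_unlock(&vma->vm->mutex);
}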
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> > index f99920652751..9e72b42a86f5 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> > @@ -152,6 +152,17 @@ static void clear_pages_dma_fence_cb(struct dma_fence *fence,
> > irq_work_queue(&w->irq_work);
> > }
> >
> > +static int move_to_active(struct i915_vma *vma, struct i915_request *rq)
> > +{
> > + int err;
> > +
> > + err = i915_request_await_active(rq, &vma->active);
> > + if (err)
> > + return err;
> > +
> > + return i915_active_ref(&vma->active, rq->timeline, rq);
> > +}
> > +
> > static void clear_pages_worker(struct work_struct *work)
> > {
> > struct clear_pages_work *w = container_of(work, typeof(*w), work);
> > @@ -211,7 +222,7 @@ static void clear_pages_worker(struct work_struct *work)
> > * keep track of the GPU activity within this vma/request, and
> > * propagate the signal from the request to w->dma.
> > */
> > - err = i915_active_ref(&vma->active, rq->timeline, rq);
> > + err = move_to_active(vma, rq);
>
> What is happening here? It wasn't sufficiently ordered before or
> something changes with this patch?
We are introducing an independent ordering constraint. The clear-pages
worker is itself intertwined with the obj->resv, but one part of the
operation must be serialised with the asynchronous binding, and we don't
want that async binding to affect the object globally. So the clear-pages
worker opencodes the awaits from i915_vma_move_to_active, but does not
partake in the fence export (as the operation is already tracked in the
obj->resv).
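For contrast, the step the worker is avoiding looks roughly like this in
the full i915_vma_move_to_active() (paraphrased from memory, not a
verbatim copy of the driver):

static int full_move_to_active(struct i915_vma *vma,
			       struct i915_request *rq,
			       unsigned int flags)
{
	int err;

	err = i915_request_await_active(rq, &vma->active);
	if (err)
		return err;

	err = i915_active_ref(&vma->active, rq->timeline, rq);
	if (err)
		return err;

	/* The part the clear-pages worker opencodes around: exporting
	 * the request fence into the object's reservation object (resv
	 * locked and shared slot reserved by the caller).
	 */
	if (flags & EXEC_OBJECT_WRITE)
		dma_resv_add_excl_fence(vma->resv, &rq->fence);
	else
		dma_resv_add_shared_fence(vma->resv, &rq->fence);

	return 0;
}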
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index f1c0e5d958f3..653f7275306a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -313,8 +313,6 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
> > GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
> >
> > release_hw_id(ctx);
> > - if (ctx->vm)
> > - i915_vm_put(ctx->vm);
>
> Contexts no longer hold a reference to vm? Only vmas?
Dropped earlier on close rather than final put...
> > @@ -379,9 +377,13 @@ void i915_gem_context_release(struct kref *ref)
> >
> > static void context_close(struct i915_gem_context *ctx)
> > {
> > + i915_gem_context_set_closed(ctx);
> > +
> > + if (ctx->vm)
> > + i915_vm_close(ctx->vm);
>
> But now closed context mean closed vm, but what about shared vms? Is
> open/close_vm now counted?
Yes, vm->open_count. This is all to avoid a double-free of the vma from
object-close and vm-close. Closing earlier also tidies up the vm release
and restores some earlier behaviour whereby we avoid unbinding from a
dead vm.
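In sketch form (assuming an atomic open_count; the helper names here are
approximate):

static struct i915_address_space *i915_vm_open(struct i915_address_space *vm)
{
	atomic_inc(&vm->open_count);
	return i915_vm_get(vm);
}

static void i915_vm_close(struct i915_address_space *vm)
{
	/* The last opener tears down the bound vmas while the vm is
	 * still alive; the kref (i915_vm_get/put) only controls the
	 * final free.
	 */
	if (atomic_dec_and_test(&vm->open_count))
		release_vm_vmas(vm); /* stand-in for the vma teardown */

	i915_vm_put(vm);
}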
> > - ret = i915_vma_bind(vma, cache_level, PIN_UPDATE);
> > + /* Wait for an earlier async bind */
> > + ret = i915_active_wait(&vma->active);
> > + if (ret)
> > + return ret;
> > +
> > + ret = i915_vma_bind(vma, cache_level, PIN_UPDATE, NULL);
>
> Waiting should not be implied in the bind? I am assuming at least there
> will be many callers to bind and like this all of them will have to know
> to do i915_active_wait first?
Not really, imo; i915_vma_bind() is a special case as it is the
low-level operation. And we only need the sync here as we are overwriting
the results of those ongoing operations; in the other (non-UPDATE)
caller, we are adding a separate binding rather than overwriting an
existing one.
> > @@ -483,6 +487,7 @@ static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj)
> > if (!drm_mm_node_allocated(&vma->node))
> > continue;
> >
> > + GEM_BUG_ON(vma->vm != &i915->ggtt.vm);
> > list_move_tail(&vma->vm_link, &vma->vm->bound_list);
> > }
> > mutex_unlock(&i915->ggtt.vm.mutex);
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index c049199a1df5..c3dd8ce7e2b7 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -550,8 +550,11 @@ eb_add_vma(struct i915_execbuffer *eb,
> > eb_unreserve_vma(vma, vma->exec_flags);
> >
> > list_add_tail(&vma->exec_link, &eb->unbound);
> > - if (drm_mm_node_allocated(&vma->node))
> > + if (drm_mm_node_allocated(&vma->node)) {
> > + mutex_lock(&vma->vm->mutex);
> > err = i915_vma_unbind(vma);
> > + mutex_unlock(&vma->vm->mutex);
>
> Should there be __i915_vma_unbind which asserts the lock and
> i915_vma_unbind which takes it?
Could do; it seems to be an even split. My goal was for i915_vma_unbind()
to evaporate; it should become a pipelined operation for the most part,
with the shrinker always being the exception as our interface there is
external.
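The split you describe would be something like (not part of this patch,
just to show the shape):

int __i915_vma_unbind(struct i915_vma *vma)
{
	lockdep_assert_held(&vma->vm->mutex);

	/* ... the current unbind body ... */
	return 0;
}

int i915_vma_unbind(struct i915_vma *vma)
{
	int err;

	mutex_lock(&vma->vm->mutex);
	err = __i915_vma_unbind(vma);
	mutex_unlock(&vma->vm->mutex);

	return err;
}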
> > +void i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f)
> > +{
> > + GEM_BUG_ON(i915_active_is_idle(ref));
> > +
> > + dma_fence_get(f);
> > +
> > + rcu_read_lock();
> > + if (rcu_access_pointer(ref->excl)) {
> > + struct dma_fence *old;
> > +
> > + old = dma_fence_get_rcu_safe(&ref->excl);
> > + if (old) {
> > + if (dma_fence_remove_callback(old, &ref->excl_cb))
> > + atomic_dec(&ref->count);
> > + dma_fence_put(old);
> > + }
> > + }
> > + rcu_read_unlock();
> > +
> > + atomic_inc(&ref->count);
> > + rcu_assign_pointer(ref->excl, f);
> > +
> > + if (dma_fence_add_callback(f, &ref->excl_cb, excl_cb)) {
> > + RCU_INIT_POINTER(ref->excl, NULL);
> > + atomic_dec(&ref->count);
> > + dma_fence_put(f);
>
> Is this some silent failure path?
No, it's just a weird indication by dma-fence that the fence was already
signaled.
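For reference, dma_fence_add_callback() returns -ENOENT when the fence
has already signalled, in which case the callback is never installed and
the caller does any completion work inline:

	err = dma_fence_add_callback(fence, &cb, callback);
	if (err == -ENOENT)
		callback(fence, &cb); /* already signalled, run it now */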
> > int i915_request_await_active(struct i915_request *rq, struct i915_active *ref)
> > {
> > - struct active_node *it, *n;
> > - int err;
> > -
> > - if (RB_EMPTY_ROOT(&ref->tree))
> > - return 0;
> > + int err = 0;
> >
> > - /* await allocates and so we need to avoid hitting the shrinker */
> > - err = i915_active_acquire(ref);
> > - if (err)
> > - return err;
> > + if (rcu_access_pointer(ref->excl)) {
> > + struct dma_fence *fence;
> >
> > - mutex_lock(&ref->mutex);
> > - rbtree_postorder_for_each_entry_safe(it, n, &ref->tree, node) {
> > - err = i915_request_await_active_request(rq, &it->base);
> > - if (err)
> > - break;
> > + rcu_read_lock();
> > + fence = dma_fence_get_rcu_safe(&ref->excl);
> > + rcu_read_unlock();
> > + if (fence) {
> > + err = i915_request_await_dma_fence(rq, fence);
> > + dma_fence_put(fence);
> > + }
> > }
> > - mutex_unlock(&ref->mutex);
> >
> > - i915_active_release(ref);
> > + /* In the future we may choose to await on all fences */
> > +
>
> This is completely changed to just wait on the "exclusive" fence? What
> about the tree of nodes? It's still in the data structure and managed
> elsewhere. Not needed any more until "future" from the comment?
It was all dead code. The use case I have for it atm is purely about
managing that exclusive timeline.
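The intended pairing, going by the hunks above, is roughly (with
work->dma standing in for whatever fence the async binder produces):

	/* The async binder publishes its fence as the vma's exclusive
	 * timeline...
	 */
	i915_active_set_exclusive(&vma->active, &work->dma);

	/* ...and anything that later needs the binding awaits it. */
	err = i915_request_await_active(rq, &vma->active);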
-Chris