[Intel-gfx] [PATCH v2] drm/i915: Track the last-active inside the i915_vma

Thu Jul 5 12:48:41 UTC 2018

Quoting Tvrtko Ursulin (2018-07-05 13:29:44)
> 
> On 05/07/2018 13:02, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-07-05 12:38:46)
> >>
> >> On 04/07/2018 09:34, Chris Wilson wrote:
> >>> Using a VMA on more than one timeline concurrently is the exception
> >>> rather than the rule (using it concurrently on multiple engines). As we
> >>> expect to only use one active tracker, store the most recently used
> >>> tracker inside the i915_vma itself and only fallback to the rbtree if
> >>> we need a second or more concurrent active trackers.
> >>>
> >>> v2: Comments on how we overwrite any existing last_active cache.
> >>>
> >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_vma.c | 50 +++++++++++++++++++++++++++++++--
> >>>    drivers/gpu/drm/i915/i915_vma.h |  1 +
> >>>    2 files changed, 49 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> >>> index cd94ffc7f079..33925e00f7e8 100644
> >>> --- a/drivers/gpu/drm/i915/i915_vma.c
> >>> +++ b/drivers/gpu/drm/i915/i915_vma.c
> >>> @@ -119,6 +119,12 @@ i915_vma_retire(struct i915_gem_active *base, struct i915_request *rq)
> >>>        __i915_vma_retire(active->vma, rq);
> >>>    }
> >>>    
> >>> +static void
> >>> +i915_vma_last_retire(struct i915_gem_active *base, struct i915_request *rq)
> >>> +{
> >>> +     __i915_vma_retire(container_of(base, struct i915_vma, last_active), rq);
> >>> +}
> >>> +
> >>>    static struct i915_vma *
> >>>    vma_create(struct drm_i915_gem_object *obj,
> >>>           struct i915_address_space *vm,
> >>> @@ -136,6 +142,7 @@ vma_create(struct drm_i915_gem_object *obj,
> >>>    
> >>>        vma->active = RB_ROOT;
> >>>    
> >>> +     init_request_active(&vma->last_active, i915_vma_last_retire);
> >>>        init_request_active(&vma->last_fence, NULL);
> >>>        vma->vm = vm;
> >>>        vma->ops = &vm->vma_ops;
> >>> @@ -895,6 +902,22 @@ static struct i915_gem_active *lookup_active(struct i915_vma *vma, u64 idx)
> >>>    {
> >>>        struct i915_vma_active *active;
> >>>        struct rb_node **p, *parent;
> >>> +     struct i915_request *old;
> >>> +
> >>> +     /*
> >>> +      * We track the most recently used timeline to skip a rbtree search
> >>> +      * for the common case, under typical loads we never need the rbtree
> >>> +      * at all. We can reuse the last_active slot if it is empty, that is
> >>> +      * after the previous activity has been retired, or if the active
> >>> +      * matches the current timeline.
> >>> +      */
> >>> +     old = i915_gem_active_raw(&vma->last_active,
> >>> +                               &vma->vm->i915->drm.struct_mutex);
> >>> +     if (!old || old->fence.context == idx)
> >>> +             goto out;
> >>
> >> Is the situation that retire can be out of order relative to
> >> move_to_active? In other words, last_active can retire before the rbtree
> >> record, and so the following new move_to_active will find last_active
> >> empty and so could create a double entry for the same timeline?
> > 
> > We don't mind a double entry, and do expect that last_active and the
> > rbtree entry will still be active, tracking different requests.
> 
> Maybe I mind double entries, if avoiding them would make the code easier 
> to understand. :) Or not that we don't mind, but need them? Different 
> requests you say, but on same timeline or?

Same timeline, we are indexing by timeline after all. Having the double
entry avoids having to search the rbtree for every new request, and even
the radixtree lookup was a noticeable wart in the profiles.

Ok, not every new request, only on changing, do you then need to lookup
the new request to promote it to cached, but still need to insert the old
cached request. That's more code just to avoid the isset() replacement.
-Chris