[Intel-gfx] [PATCH v2] drm/i915: Track the last-active inside the i915_vma

Wed Jul 4 11:34:04 UTC 2018

On 04/07/2018 10:39, Tvrtko Ursulin wrote:
> 
> On 04/07/2018 09:34, Chris Wilson wrote:
>> Using a VMA on more than one timeline concurrently is the exception
>> rather than the rule (using it concurrently on multiple engines). As we
>> expect to only use one active tracker, store the most recently used
>> tracker inside the i915_vma itself and only fallback to the rbtree if
>> we need a second or more concurrent active trackers.
>>
>> v2: Comments on how we overwrite any existing last_active cache.
>>
>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_vma.c | 50 +++++++++++++++++++++++++++++++--
>>   drivers/gpu/drm/i915/i915_vma.h |  1 +
>>   2 files changed, 49 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c 
>> b/drivers/gpu/drm/i915/i915_vma.c
>> index cd94ffc7f079..33925e00f7e8 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -119,6 +119,12 @@ i915_vma_retire(struct i915_gem_active *base, 
>> struct i915_request *rq)
>>       __i915_vma_retire(active->vma, rq);
>>   }
>> +static void
>> +i915_vma_last_retire(struct i915_gem_active *base, struct 
>> i915_request *rq)
>> +{
>> +    __i915_vma_retire(container_of(base, struct i915_vma, 
>> last_active), rq);
>> +}
>> +
>>   static struct i915_vma *
>>   vma_create(struct drm_i915_gem_object *obj,
>>          struct i915_address_space *vm,
>> @@ -136,6 +142,7 @@ vma_create(struct drm_i915_gem_object *obj,
>>       vma->active = RB_ROOT;
>> +    init_request_active(&vma->last_active, i915_vma_last_retire);
>>       init_request_active(&vma->last_fence, NULL);
>>       vma->vm = vm;
>>       vma->ops = &vm->vma_ops;
>> @@ -895,6 +902,22 @@ static struct i915_gem_active 
>> *lookup_active(struct i915_vma *vma, u64 idx)
>>   {
>>       struct i915_vma_active *active;
>>       struct rb_node **p, *parent;
>> +    struct i915_request *old;
>> +
>> +    /*
>> +     * We track the most recently used timeline to skip a rbtree search
>> +     * for the common case, under typical loads we never need the rbtree
>> +     * at all. We can reuse the last_active slot if it is empty, that is
>> +     * after the previous activity has been retired, or if the active
>> +     * matches the current timeline.
>> +     */
>> +    old = i915_gem_active_raw(&vma->last_active,
>> +                  &vma->vm->i915->drm.struct_mutex);
>> +    if (!old || old->fence.context == idx)
>> +        goto out;
>> +
>> +    /* Move the currently active fence into the rbtree */
>> +    idx = old->fence.context;
>>       parent = NULL;
>>       p = &vma->active.rb_node;
>> @@ -903,7 +926,7 @@ static struct i915_gem_active 
>> *lookup_active(struct i915_vma *vma, u64 idx)
>>           active = rb_entry(parent, struct i915_vma_active, node);
>>           if (active->timeline == idx)
>> -            return &active->base;
>> +            goto replace;
>>           if (active->timeline < idx)
>>               p = &parent->rb_right;
>> @@ -922,7 +945,25 @@ static struct i915_gem_active 
>> *lookup_active(struct i915_vma *vma, u64 idx)
>>       rb_link_node(&active->node, parent, p);
>>       rb_insert_color(&active->node, &vma->active);
>> -    return &active->base;
>> +replace:
>> +    /*
>> +     * Overwrite the previous active slot in the rbtree with 
>> last_active,
>> +     * leaving last_active zeroed. If the previous slot is still active,
>> +     * we must be careful as we now only expect to recieve one retire
> 
> typo in receive
> 
>> +     * callback not two, and so much undo the active counting for the
>> +     * overwritten slot.
>> +     */
>> +    if (i915_gem_active_isset(&active->base)) {
>> +        __list_del_entry(&active->base.link);
>> +        vma->active_count--;
>  > +        GEM_BUG_ON(!vma->active_count);
> 
> I still don't get this. The cache is exclusive, so when transferring a 
> record from rbtree to last_active, why do we need to decrement the 
> vma->active_count here? Don't get the part in the comment about two 
> retires - do you really sometimes expect two - ie cache is not exclusive?
> 
> But the fact that lookup of a cached entry is a straight return, meaning 
> vma->active_count is manipulated elsewhere, makes me think it is 
> avoidable messing with it on this path as well.
> 
> Maybe the separation of duties between the callers and this function 
> needs to be stronger.

Hmm or is your cache actually inclusive? Don't see no rbtree 
manipulation on migration to and from last_active/rbtree..

And since rbtree lookup is always for the last_active context id, you 
would otherwise never hit the the "goto replace" path.

How do you ever look up an id which is not cached in last_active then?

I am thoroughly confused now..

Regards,

Tvrtko