[Intel-gfx] [PATCH v2] drm/i915: Track the last-active inside the i915_vma
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Wed Jul 4 12:30:39 UTC 2018
On 04/07/2018 12:47, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-04 12:34:04)
>>
>> On 04/07/2018 10:39, Tvrtko Ursulin wrote:
>>>
>>> On 04/07/2018 09:34, Chris Wilson wrote:
>>>> Using a VMA on more than one timeline concurrently is the exception
>>>> rather than the rule (using it concurrently on multiple engines). As we
>>>> expect to only use one active tracker, store the most recently used
>>>> tracker inside the i915_vma itself and only fallback to the rbtree if
>>>> we need a second or more concurrent active trackers.
>>>>
>>>> v2: Comments on how we overwrite any existing last_active cache.
>>>>
>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_vma.c | 50 +++++++++++++++++++++++++++++++--
>>>> drivers/gpu/drm/i915/i915_vma.h | 1 +
>>>> 2 files changed, 49 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_vma.c
>>>> b/drivers/gpu/drm/i915/i915_vma.c
>>>> index cd94ffc7f079..33925e00f7e8 100644
>>>> --- a/drivers/gpu/drm/i915/i915_vma.c
>>>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>>>> @@ -119,6 +119,12 @@ i915_vma_retire(struct i915_gem_active *base,
>>>> struct i915_request *rq)
>>>> __i915_vma_retire(active->vma, rq);
>>>> }
>>>> +static void
>>>> +i915_vma_last_retire(struct i915_gem_active *base, struct
>>>> i915_request *rq)
>>>> +{
>>>> + __i915_vma_retire(container_of(base, struct i915_vma,
>>>> last_active), rq);
>>>> +}
>>>> +
>>>> static struct i915_vma *
>>>> vma_create(struct drm_i915_gem_object *obj,
>>>> struct i915_address_space *vm,
>>>> @@ -136,6 +142,7 @@ vma_create(struct drm_i915_gem_object *obj,
>>>> vma->active = RB_ROOT;
>>>> + init_request_active(&vma->last_active, i915_vma_last_retire);
>>>> init_request_active(&vma->last_fence, NULL);
>>>> vma->vm = vm;
>>>> vma->ops = &vm->vma_ops;
>>>> @@ -895,6 +902,22 @@ static struct i915_gem_active
>>>> *lookup_active(struct i915_vma *vma, u64 idx)
>>>> {
>>>> struct i915_vma_active *active;
>>>> struct rb_node **p, *parent;
>>>> + struct i915_request *old;
>>>> +
>>>> + /*
>>>> + * We track the most recently used timeline to skip a rbtree search
>>>> + * for the common case, under typical loads we never need the rbtree
>>>> + * at all. We can reuse the last_active slot if it is empty, that is
>>>> + * after the previous activity has been retired, or if the active
>>>> + * matches the current timeline.
>>>> + */
>>>> + old = i915_gem_active_raw(&vma->last_active,
>>>> + &vma->vm->i915->drm.struct_mutex);
>>>> + if (!old || old->fence.context == idx)
>>>> + goto out;
>>>> +
>>>> + /* Move the currently active fence into the rbtree */
>>>> + idx = old->fence.context;
>>>> parent = NULL;
>>>> p = &vma->active.rb_node;
>>>> @@ -903,7 +926,7 @@ static struct i915_gem_active
>>>> *lookup_active(struct i915_vma *vma, u64 idx)
>>>> active = rb_entry(parent, struct i915_vma_active, node);
>>>> if (active->timeline == idx)
>>>> - return &active->base;
>>>> + goto replace;
>>>> if (active->timeline < idx)
>>>> p = &parent->rb_right;
>>>> @@ -922,7 +945,25 @@ static struct i915_gem_active
>>>> *lookup_active(struct i915_vma *vma, u64 idx)
>>>> rb_link_node(&active->node, parent, p);
>>>> rb_insert_color(&active->node, &vma->active);
>>>> - return &active->base;
>>>> +replace:
>>>> + /*
>>>> + * Overwrite the previous active slot in the rbtree with
>>>> last_active,
>>>> + * leaving last_active zeroed. If the previous slot is still active,
>>>> + * we must be careful as we now only expect to recieve one retire
>>>
>>> typo in receive
>>>
>>>> + * callback not two, and so much undo the active counting for the
>>>> + * overwritten slot.
>>>> + */
>>>> + if (i915_gem_active_isset(&active->base)) {
>>>> + __list_del_entry(&active->base.link);
>>>> + vma->active_count--;
>>> > + GEM_BUG_ON(!vma->active_count);
>>>
>>> I still don't get this. The cache is exclusive, so when transferring a
>>> record from rbtree to last_active, why do we need to decrement the
>>> vma->active_count here? Don't get the part in the comment about two
>>> retires - do you really sometimes expect two - ie cache is not exclusive?
>>>
>>> But the fact that lookup of a cached entry is a straight return, meaning
>>> vma->active_count is manipulated elsewhere, makes me think it is
>>> avoidable messing with it on this path as well.
>>>
>>> Maybe the separation of duties between the callers and this function
>>> needs to be stronger.
>>
>> Hmm or is your cache actually inclusive? Don't see no rbtree
>> manipulation on migration to and from last_active/rbtree..
>
> Both. Inclusive in the sense that both last_active and its timeline slot
> in the rbtree may be active tracking different requests and so receive
> retirement callbacks independently. Exclusive in that we don't store
> last_active in the cache slot and in the rbtree.
I don't know how to reconcile between two sentences (statements). They
seem in contradiction. :(
>> And since rbtree lookup is always for the last_active context id, you
>> would otherwise never hit the the "goto replace" path.
>>
>> How do you ever look up an id which is not cached in last_active then?
>
> We don't. We only lookup on evicting a still active request from
> last_active. The MRU recent request always goes into last_active.
Sorry lost, another day maybe.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list