[Intel-gfx] [PATCH 5/6] drm/i915: Track vma activity per fence.context, not per engine

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Jul 4 09:43:07 UTC 2018


On 03/07/2018 21:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-03 18:28:31)
>>
>> On 29/06/2018 23:54, Chris Wilson wrote:
>>> In the next patch, we will want to be able to use more flexible request
>>> timelines that can hop between engines. From the vma pov, we can then
>>> not rely on the binding of this request to an engine and so can not
>>> ensure that different requests are ordered through a per-engine
>>> timeline, and so we must track activity of all timelines. (We track
>>> activity on the vma itself to prevent unbinding from HW before the HW
>>> has finished accessing it.)
>>>
>>> v2: Switch to a rbtree for 32b safety (since using u64 as a radixtree
>>> index is fraught with aliasing of unsigned longs).
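
To illustrate the aliasing the v2 note refers to, here is a standalone userspace C sketch (not i915 code): on 32b, unsigned long - the radix tree index type - is only 32 bits wide, so two fence contexts that differ only in their upper bits would collapse onto the same index.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/*
	 * Two 64-bit fence context ids that differ only in the upper
	 * 32 bits.  A radix tree indexed by unsigned long on a 32b
	 * kernel would see the same (truncated) index for both.
	 */
	uint64_t ctx_a = 0x100000001ULL;
	uint64_t ctx_b = 0x200000001ULL;
	uint32_t idx_a = (uint32_t)ctx_a;
	uint32_t idx_b = (uint32_t)ctx_b;

	printf("ctx_a=%llx ctx_b=%llx aliased=%d\n",
	       (unsigned long long)ctx_a,
	       (unsigned long long)ctx_b,
	       idx_a == idx_b);
	return 0;
}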

[snip]

>>> +struct i915_vma_active {
>>> +     struct i915_gem_active base;
>>> +     struct i915_vma *vma;
>>> +     struct rb_node node;
>>> +     u64 timeline;
>>
>> If my quick calculations are correct this is (8 + 16 + 8) + 8 + 20 + 8 =
>> 68 bytes - just unluckily over the 64-byte slab, so at some point I think
>> it will warrant a dedicated slab to avoid wastage.
> 
> Hmm, isn't it 7 pointers + a u64.
> 
>   sizeof(i915_vma_active)=72
>   offsetofend(i915_vma_active.base)=32
>   offsetofend(i915_vma_active.vma)=40
>   offsetofend(i915_vma_active.node)=64
>   offsetofend(i915_vma_active.timeline)=72
> 
> Bah i915_gem_active is bigger than I remember.

So it goes into the 96-byte bucket and the waste is 24 bytes per entry. I 
thought there was only a 128-byte bucket, and 128 minus my incorrect 68 
was much more waste. So OK, we can leave it for later optimisation.
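
For reference, a standalone userspace C sketch (LP64, with stand-in structs that only approximate the real layouts, so the actual i915 definitions may differ) of how the 72 bytes land in the generic kmalloc buckets:

#include <stdio.h>
#include <stdint.h>

struct fake_list_head { void *next, *prev; };			/* 16 bytes */
struct fake_rb_node { unsigned long color; void *right, *left; };	/* 24 bytes */

struct fake_gem_active {		/* ~ i915_gem_active: 8 + 16 + 8 = 32 */
	void *request;
	struct fake_list_head link;
	void *retire;
};

struct fake_vma_active {		/* ~ i915_vma_active */
	struct fake_gem_active base;	/* ends at 32 */
	void *vma;			/* ends at 40 */
	struct fake_rb_node node;	/* ends at 64 */
	uint64_t timeline;		/* ends at 72 */
};

/* Round up to the generic kmalloc buckets around this size. */
static size_t kmalloc_bucket(size_t sz)
{
	static const size_t buckets[] = { 64, 96, 128, 192, 256 };

	for (size_t i = 0; i < sizeof(buckets) / sizeof(buckets[0]); i++)
		if (sz <= buckets[i])
			return buckets[i];
	return sz;
}

int main(void)
{
	size_t sz = sizeof(struct fake_vma_active);

	printf("sizeof=%zu bucket=%zu waste=%zu\n",
	       sz, kmalloc_bucket(sz), kmalloc_bucket(sz) - sz);
	return 0;
}

On LP64 this prints sizeof=72 bucket=96 waste=24, matching the numbers above.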

[snip]

>>>        /*
>>>         * Add a reference if we're newly entering the active list.
>>>         * The order in which we add operations to the retirement queue is
>>> @@ -896,11 +947,13 @@ int i915_vma_move_to_active(struct i915_vma *vma,
>>>         * add the active reference first and queue for it to be dropped
>>>         * *last*.
>>>         */
>>> -     if (!i915_vma_is_active(vma))
>>> +     if (!i915_gem_active_isset(active) && !vma->active_count++) {
>>
>> Wouldn't checking vma->active_count (which is what i915_vma_is_active
>> does) be enough? Can it be zero while active is _set_?
> 
> No, in the next patch we can have active_count with vma->active unset.

In that case, please make this patch contain only what it needs for 
itself, and introduce that change in the next one.
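
To make that concrete, a minimal sketch (hypothetical simplified types, not the actual i915 code) of the check under discussion: the per-vma active_count and the per-timeline slot are separate pieces of state, and only the transition from idle takes the extra reference.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins, heavily simplified from the real structures. */
struct fake_vma {
	unsigned int active_count;	/* timelines currently tracking us */
	unsigned int refcount;
};

struct fake_vma_active {
	bool isset;			/* does this timeline hold a request? */
};

static void fake_move_to_active(struct fake_vma *vma,
				struct fake_vma_active *slot)
{
	/*
	 * Mirror of the quoted check: take the extra reference only when
	 * the vma first becomes active.  The per-slot test only matters
	 * once active_count can be non-zero while this particular slot
	 * is still unset, which is what the next patch introduces.
	 */
	if (!slot->isset && !vma->active_count++)
		vma->refcount++;
	slot->isset = true;
}

int main(void)
{
	struct fake_vma vma = { 0, 1 };
	struct fake_vma_active rcs = { false }, bcs = { false };

	fake_move_to_active(&vma, &rcs);	/* first timeline: +1 ref */
	fake_move_to_active(&vma, &bcs);	/* second timeline: no extra ref */

	printf("active_count=%u refcount=%u\n", vma.active_count, vma.refcount);
	return 0;
}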

Regards,

Tvrtko

