[Intel-gfx] [PATCH 20/41] drm/i915: Replace priolist rbtree with a skiplist

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Jan 29 09:24:18 UTC 2021


On 28/01/2021 22:44, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2021-01-28 16:42:44)
>>
>> On 28/01/2021 16:26, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2021-01-28 15:56:19)
>>>> On 25/01/2021 14:01, Chris Wilson wrote:
>>>>> diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
>>>>> index bc2fa84f98a8..1200c3df6a4a 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_priolist_types.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_priolist_types.h
>>>>> @@ -38,10 +38,36 @@ enum {
>>>>>     #define I915_PRIORITY_UNPREEMPTABLE INT_MAX
>>>>>     #define I915_PRIORITY_BARRIER (I915_PRIORITY_UNPREEMPTABLE - 1)
>>>>>     
>>>>> +#ifdef CONFIG_64BIT
>>>>> +#define I915_PRIOLIST_HEIGHT 12
>>>>> +#else
>>>>> +#define I915_PRIOLIST_HEIGHT 11
>>>>> +#endif
>>>>
>>>> I did not get this. On one hand I could think pointers are larger on
>>>> 64-bit so go for fewer levels, if size was a concern. But on the other
>>>> hand 32-bit is less important these days, definitely much less as a
>>>> performance platform. So going for less memory use => worse performance
>>>> on a less important platform, which typically could be more memory
>>>> constrained? Not sure I see it as that important either way to be
>>>> distinctive but a comment would satisfy me.
>>>
>>> Just aligned to the cacheline. The struct is 128B on 64b and 64B on 32b.
>>> On 64B, we will scale to around 16 million requests in flight and 4
>>> million on 32b. Which should be enough.
>>>
>>> If we shrunk 64b to a 64B node, we would only scale to 256 requests
>>> which limit we definitely will exceed.
>>
>> Ok thanks, pouring it into a comment is implied.
>>
>>>
>>>>>     struct i915_priolist {
>>>>>         struct list_head requests;
>>>>
>>>> What would be on this list? Request can only be on one at a time, so I
>>>> was thinking these nodes would have pointers to list of that priority,
>>>> rather than lists themselves. Assuming there can be multiple nodes of
>>>> the same priority in the 2d hierarcy. Possibly I don't understand the
>>>> layout.
>>>
>>> A request is only on one list (queue, active, hold). But we may still
>>> have more than one request at the same deadline, though that will likely
>>> be limited to priority-inheritance and timeslice deferrals.
>>>
>>> Since we would need pointer to the request, we could only reclaim a
>>> single pointer here, which is not enough to warrant reducing the overall
>>> node size. And while there is at least one user of request->sched.link,
>>> the list maintenance will still be incurred. Using request->sched.link
>>> remains a convenient interface.
>>
>> Lost you.
> 
> /*
>   * i915_priolist forms a skiplist. The skiplist is built in layers,
>   * starting at the base [0] is a singly linked list of all i915_priolist.
>   * Each higher layer contains a fraction of the i915_priolist from the
>   * previous layer:
>   *
>   * S[0] 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF S
>   * E[1] >1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F>1>3>5>7>9>B>D>F E
>   * N[2] -->3-->7-->B-->F-->3-->7-->B-->F-->3-->7-->B-->F-->3-->7-->B-->F N
>   * T[3] ------->7----->F-------7------>F------>7------>F------>7------>F 

Just align this first 7.

T
>   * I[4] -------------->F-------------->F-------------->F-------------->F I
>   * N[5] ------------------------------>F------------------------------>F N
>   * E[6] ------------------------------>F-------------------------------> E
>   * L[7] ---------------------------------------------------------------> L
>   *
>   * To iterate through all active i915_priolist, we only need to follow
>   * the chain in i915_priolist.next[0] (see for_each_priolist).
>   *
>   * To quickly find a specific key (or insert point), we can perform a binary
>   * search by starting at the highest level and following the linked list
>   * at that level until we either find the node, or have gone passed the key.
>   * Then we descend a level, and start walking the list again starting from
>   * the current position, until eventually we find our key, or we run out of

 From the previous on the current level, not current I believe. So go 
previous before descending, right?

Very useful diagram, thank you.

Regards,

Tvrtko


More information about the Intel-gfx mailing list