[PATCH 2/4] drm/i915: Retire requests along rings

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Apr 20 17:16:44 UTC 2018


On 20/04/2018 18:05, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-04-20 17:59:47)
>>
>> On 20/04/2018 15:42, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-04-20 15:28:35)
>>>>
>>>> On 20/04/2018 12:35, Chris Wilson wrote:
>>>>> In the next patch, rings are the central timeline as requests may jump
>>>>> between engines. Therefore in the future as we retire in order along the
>>>>> engine timeline, we may retire out-of-order within a ring (as the ring now
>>>>> occurs along multiple engines), leading to much hilarity in miscomputing
>>>>> the position of ring->head.
>>>>
>>>> Will we be retiring out of order within a ring? I thought we said not,
>>>> that is, no parallel execution of a single ring on two engines. And if we
>>>> won't, then I don't see how a ring gets retired out of order.
>>>
>>> Within a ring? No. The problem occurred (without this patch) as the ring
>>> popped up on different engines, and so advance_ring() was called out of
>>> sync (but in order on each engine ofc). As a result ring->head jumped
>>> all over the place.
>>
>> Aaah I get it. Hm, that kind of gives away something about how you
>> implemented the virtual engine, no? So I am not sure splitting this
>> ahead of the rest is the best approach.
> 
> I thought it merited landing early because better separation of work
> between clients has been on our wishlist. Only retiring from our ring
> fits that bill, so I went eureka, send!

Okay from that angle I could buy it. :) Would just want to see what 
happens later before being 100% sure.
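
To spell out the failure mode for myself, a minimal sketch of the head 
bookkeeping, assuming head is advanced from something like a per-request 
postfix offset (the names here are illustrative, not quoted from the patch):

static void advance_ring(struct i915_request *rq)
{
	struct intel_ring *ring = rq->ring;

	/*
	 * Must be called in submission order for this ring. If the ring's
	 * requests are spread across per-engine lists and each engine
	 * retires independently, a later request can be processed before
	 * an earlier one and this store moves head backwards, corrupting
	 * the free-space accounting for the ring.
	 */
	ring->head = rq->postfix;
}
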

>>>>>     void i915_retire_requests(struct drm_i915_private *i915)
>>>>>     {
>>>>> -     struct intel_engine_cs *engine;
>>>>> -     enum intel_engine_id id;
>>>>> +     struct intel_ring *ring, *next;
>>>>>     
>>>>>         lockdep_assert_held(&i915->drm.struct_mutex);
>>>>>     
>>>>>         if (!i915->gt.active_requests)
>>>>>                 return;
>>>>>     
>>>>> -     for_each_engine(engine, i915, id)
>>>>> -             engine_retire_requests(engine);
>>>>> +     list_for_each_entry_safe(ring, next, &i915->gt.rings, link)
>>>>> +             ring_retire_requests(ring);
>>>>
>>>> This is the less efficient bit - but I actually cannot decide whether it
>>>> matters in practice (walking potentially so many more lists), or whether
>>>> the removal of the irq spinlock wins. For real-world use it probably
>>>> doesn't matter. For pathological IGTs we probably don't care.
>>>
>>> Aye. i915_retire_requests() should never be on the critical path; it's
>>> used as the catch up after idling, or by the idle-worker to flush the
>>> requests. It is run under struct_mutex though, and as such not entirely
>>> painless. Solving requests + struct_mutex will certainly be interesting.
>>
>> Ignoring the really uninteresting call sites, it is also called on
>> shrinking and eviction (and perf - but I did not try to understand why).
>> So that might have some impact, although it still wouldn't dominate, I think.
> 
> All sites where we already hit a slow point and are trying to catch up,
> hoping the system recovered before we have to actually do anything
> drastic; and perf has this nasty requirement of serialising the world
> when it switches on.

Yes, not raising any show stoppers, just thinking out loud.
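
For reference, my mental model of the per-ring walk that replaces the 
per-engine one - only a sketch, assuming requests sit on a 
ring->request_list in submission order via a ring_link node (the patch 
itself is authoritative):

static void ring_retire_requests(struct intel_ring *ring)
{
	struct i915_request *rq, *next;

	/*
	 * Requests on a single ring complete in submission order, so we
	 * can stop at the first request still in flight; no engine irq
	 * spinlock is needed just to walk this list under struct_mutex.
	 */
	list_for_each_entry_safe(rq, next, &ring->request_list, ring_link) {
		if (!i915_request_completed(rq))
			break;

		i915_request_retire(rq);
	}
}
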

Regards,

Tvrtko

