[Intel-gfx] [PATCH 4/5] drm/i915/gt: Adapt engine_park synchronisation rules for engine_retire
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Nov 21 16:17:59 UTC 2019
On 21/11/2019 14:53, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-21 14:42:56)
>>
>> On 21/11/2019 13:51, Chris Wilson wrote:
>>> In the next patch, we will introduce a new asynchronous retirement
>>> worker, fed by execlists CS events. Here we may queue a retirement as
>>> soon as a request is submitted to HW (and completes instantly), and we
>>> also want to process that retirement as early as possible and cannot
>>> afford to postpone (as there may not be another opportunity to retire it
>>> for a few seconds). To allow the new async retirer to run in parallel
>>> with our submission, pull the __i915_request_queue (that passes the
>>> request to HW) inside the timelines spinlock so that the retirement
>>> cannot release the timeline before we have completed the submission.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/gt/intel_engine_pm.c | 29 ++++++++++++++++-------
>>> 1 file changed, 21 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> index 373a4b9f159c..bd0af02bea16 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> @@ -74,18 +74,33 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
>>> #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
>>>
>>> static void
>>> -__intel_timeline_enter_and_release_pm(struct intel_timeline *tl,
>>> - struct intel_engine_cs *engine)
>>> +__queue_and_release_pm(struct i915_request *rq,
>>> + struct intel_timeline *tl,
>>> + struct intel_engine_cs *engine)
>>> {
>>> struct intel_gt_timelines *timelines = &engine->gt->timelines;
>>>
>>> + /*
>>> + * We have to serialise all potential retirement paths with our
>>> + * submission, as we don't want to underflow either the
>>> + * engine->wakeref.counter or our timeline->active_count.
>>> + *
>>> + * Equally, we cannot allow a new submission to start until
>>> + * after we finish queueing, nor could we allow that submitter
>>> + * to retire us before we are ready!
>>> + */
>>> spin_lock(&timelines->lock);
>>>
>>> - if (!atomic_fetch_inc(&tl->active_count))
>>> - list_add_tail(&tl->link, &timelines->active_list);
>>> + /* Hand the request over to HW and so engine_retire() */
>>> + __i915_request_queue(rq, NULL);
>>>
>>> + /* Let new submissions commence (and maybe retire this timeline) */
>>> __intel_wakeref_defer_park(&engine->wakeref);
>>>
>>> + /* Let intel_gt_retire_requests() retire us */
>>> + if (!atomic_fetch_inc(&tl->active_count))
>>> + list_add_tail(&tl->link, &timelines->active_list);
>>> +
>>> spin_unlock(&timelines->lock);
>>
>> Now that everything is under the lock, is the order of operations no
>> longer important, or does it still matter?
>
> Queue before unpark, that is required.
>
> For unpark and add_to_timeline, the order is flexible as the lock governs
> the interactions between those and the retirers. So I chose to allow the
> next newcomer to start a few instructions earlier.
Yes, because of different locks. So the comment above
__intel_wakeref_defer_park ("and maybe retire this timeline") is not
correct, since the timeline cannot be retired until the lock is dropped.
It's only the preservation of timeline ordering which mandates defer_park
after request_queue, at least as far as I am able to summon my
understanding from yesterday.
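
Putting the two points together, this is my reading of the post-patch
flow; the annotations are mine, not part of the patch:

static void
__queue_and_release_pm(struct i915_request *rq,
		       struct intel_timeline *tl,
		       struct intel_engine_cs *engine)
{
	struct intel_gt_timelines *timelines = &engine->gt->timelines;

	spin_lock(&timelines->lock);

	/*
	 * Fixed ordering: the request must be on the timeline before the
	 * engine is unparked, so that any submission racing in behind
	 * defer_park observes it and timeline ordering is preserved.
	 */
	__i915_request_queue(rq, NULL);

	/* From here on, new submitters may grab the engine wakeref */
	__intel_wakeref_defer_park(&engine->wakeref);

	/*
	 * Flexible ordering vs defer_park: engine_retire() and
	 * intel_gt_retire_requests() both contend on timelines->lock,
	 * so neither can see this timeline until we unlock, whichever
	 * side of defer_park this lands on.
	 */
	if (!atomic_fetch_inc(&tl->active_count))
		list_add_tail(&tl->link, &timelines->active_list);

	spin_unlock(&timelines->lock);
}
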
Regards,
Tvrtko