[Intel-gfx] [PATCH 03/11] drm/i915/execlists: Suppress redundant preemption

Fri Mar 1 15:07:58 UTC 2019

On 01/03/2019 11:36, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-01 11:31:26)
>>
>> ping on below
>>
>> On 28/02/2019 13:11, Tvrtko Ursulin wrote:
>>>
>>> On 26/02/2019 10:23, Chris Wilson wrote:
>>>> On unwinding the active request we give it a small (limited to internal
>>>> priority levels) boost to prevent it from being gazumped a second time.
>>>> However, this means that it can be promoted to above the request that
>>>> triggered the preemption request, causing a preempt-to-idle cycle for no
>>>> change. We can avoid this if we take the boost into account when
>>>> checking if the preemption request is valid.
>>>>
>>>> v2: After preemption the active request will be after the preemptee if
>>>> they end up with equal priority.
>>>>
>>>> v3: Tvrtko pointed out that this, the existing logic, makes
>>>> I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!
>>>>
>>>> v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
>>>> v5: Except not all priorities were made equal, and the WAIT not
>>>> preempting
>>>> is only if we start off as !NEWCLIENT.
>>>>
>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
>>>>    1 file changed, 34 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>>> index 0e20f3bc8210..dba19baf6808 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>> @@ -164,6 +164,8 @@
>>>>    #define WA_TAIL_DWORDS 2
>>>>    #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
>>>> +#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
>>>> +
>>>>    static int execlists_context_deferred_alloc(struct i915_gem_context
>>>> *ctx,
>>>>                            struct intel_engine_cs *engine,
>>>>                            struct intel_context *ce);
>>>> @@ -190,8 +192,30 @@ static inline int rq_prio(const struct
>>>> i915_request *rq)
>>>>    static int effective_prio(const struct i915_request *rq)
>>>>    {
>>>> +    int prio = rq_prio(rq);
>>>> +
>>>> +    /*
>>>> +     * On unwinding the active request, we give it a priority bump
>>>> +     * equivalent to a freshly submitted request. This protects it from
>>>> +     * being gazumped again, but it would be preferable if we didn't
>>>> +     * let it be gazumped in the first place!
>>>> +     *
>>>> +     * See __unwind_incomplete_requests()
>>>> +     */
>>>> +    if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
>>>> +        /*
>>>> +         * After preemption, we insert the active request at the
>>>> +         * end of the new priority level. This means that we will be
>>>> +         * _lower_ priority than the preemptee all things equal (and
>>>> +         * so the preemption is valid), so adjust our comparison
>>>> +         * accordingly.
>>>> +         */
>>>> +        prio |= ACTIVE_PRIORITY;
>>>> +        prio--;
>>>> +    }
>>>> +
>>>>        /* Restrict mere WAIT boosts from triggering preemption */
>>>> -    return rq_prio(rq) | __NO_PREEMPTION;
>>>> +    return prio | __NO_PREEMPTION;
>>>>    }
>>>>    static int queue_prio(const struct intel_engine_execlists *execlists)
>>>> @@ -359,7 +383,7 @@ __unwind_incomplete_requests(struct
>>>> intel_engine_cs *engine)
>>>>    {
>>>>        struct i915_request *rq, *rn, *active = NULL;
>>>>        struct list_head *uninitialized_var(pl);
>>>> -    int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
>>>> +    int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
>>>>        lockdep_assert_held(&engine->timeline.lock);
>>>> @@ -390,9 +414,15 @@ __unwind_incomplete_requests(struct
>>>> intel_engine_cs *engine)
>>>>         * The active request is now effectively the start of a new client
>>>>         * stream, so give it the equivalent small priority bump to prevent
>>>>         * it being gazumped a second time by another peer.
>>>> +     *
>>>> +     * One consequence of this preemption boost is that we may jump
>>>> +     * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
>>>> +     * making those priorities non-preemptible. They will be moved
>>>> forward
>>>
>>> After the previous patch wait priority is non-preemptible by definition
>>> making this suggestion preemption boost is making it so not accurate.
>>>
>>>> +     * in the priority queue, but they will not gain immediate access to
>>>> +     * the GPU.
>>>>         */
>>>> -    if (!(prio & I915_PRIORITY_NEWCLIENT)) {
>>>> -        prio |= I915_PRIORITY_NEWCLIENT;
>>>> +    if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
>>>
>>> What is the importance of the has_started check? Hasn't the active
>>> request been running by definition?
> 
> No. Semaphores. This is all about defending against incorrect promotion
> while a request is still spinning on its dependencies (or else we get
> promoted above them and PI is broken).

Is init_breadcrumb after the semaphore, ie. __i915_request_has_started 
will be false while spinning on the semaphore. That possibly makes 
sense.. But you know what I'll say next. It is extremely subtle and 
sprinkled over the code so here we definitely need a comment explaining it.

Regards,

Tvrtko