[Intel-gfx] [PATCH] drm/i915: Special handling for bonded requests

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri May 29 07:25:11 UTC 2020


On 28/05/2020 21:23, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-05-28 11:20:07)
>>
>> On 28/05/2020 10:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-05-27 09:53:22)
>>>> +static void
>>>> +mark_bonded_pair(struct i915_request *rq, struct i915_request *signal)
>>>> +{
>>>> +       /*
>>>> +        * Give (temporary) special meaning to a pair of requests with a
>>>> +        * requested aligned start along the video engines.
>>>> +        *
>>>> +        * They should be non-preemptable and have all ELSP ports to themselves
>>>> +        * to avoid any deadlocks caused by inversions.
>>>> +        *
>>>> +        * Gen11+
>>>> +        */
>>>> +       if (INTEL_GEN(rq->i915) < 11 ||
>>>> +           rq->engine->class != VIDEO_DECODE_CLASS ||
>>>> +           rq->engine->class != signal->engine->class)
>>>> +               return;
>>>> +
>>>> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
>>>> +       set_bit(I915_FENCE_FLAG_NOPREEMPT, &signal->fence.flags);
>>>> +
>>>> +       intel_context_set_single_submission(rq->context);
>>>> +       intel_context_set_single_submission(signal->context);
>>>
>>> The thought that just popped into my head:
>>>
>>> This can be after signal is already submitted into ELSP[1].
>>
>> Yep I knew that but thought it would still work.
>>
>> Master in vcs0 port1, slave in vcs1 port0 or queued.
>>
>> If queued that means at worst case another bonded pair is running on
>> same engines, so they should be able to complete.
>>
>> If slave submitted to vcs1 port0 then it will stay there until whatever
>> is in vcs0 port0 finishes and lets master in.
>>
>> Do you see a possibility for things to go bad?
> 
> Because the master is already submitted in port1, the bond can go into
> port0. Then a second bond turns up for the master in port0, and we're
> back at square one.
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 37855ae8f8b3..698608e11df3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2502,6 +2502,7 @@ static void eb_request_add(struct i915_execbuffer *eb)
>          lockdep_unpin_lock(&tl->mutex, rq->cookie);
>   
>          trace_i915_request_add(rq);
> +       set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
>   
>          prev = __i915_request_commit(rq);
> 
> Will do the trick.
> 
> (Plus fixing up the rules for assert_pending_valid).

Hmm yes, my logic was flawed because I missed the async disconnect 
between master and slave submission on both ends. That's why Xiaogang 
was saying slaves must not have no-preempt set... But sentinel on 
everything? Or just on everything vcs and gen11+?

So if we had indeed kept the slave preemptible the deadlock would have 
been avoided, I think, but whether the media pipeline can handle that 
is the question.

Another question is that it sounds like it could be possible to work 
around this in userspace, combined with this patch (from the original 
thread), if a fence was used to block the master until the slave is 
submitted:

  split_fence = sw_fence_create()
  execbuf(master, in_fence = split_fence) = master_fence
  execbuf(slave, submit_fence = master_fence)
  sw_fence_advance(split_fence)

That would make sure the single-port and no-preempt properties are 
applied before either master or slave can enter the ELSP.
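
For illustration, a rough sketch of that sequence in IGT terms. This is 
only a sketch under assumptions: it uses an IGT cork fence 
(igt_cork_plug/igt_cork_unplug) as a stand-in for the sw_fence above, 
relies on the existing I915_EXEC_FENCE_IN/OUT/SUBMIT execbuf flags, and 
assumes the two execbufs already carry the engine map and bonds set up 
on their contexts; submit_bonded_pair() is just a made-up helper name.

  #include "igt.h"

  static void
  submit_bonded_pair(int i915, struct igt_cork *cork,
                     struct drm_i915_gem_execbuffer2 *master,
                     struct drm_i915_gem_execbuffer2 *slave)
  {
          int split_fence, master_fence;

          /* split_fence = sw_fence_create() */
          split_fence = igt_cork_plug(cork, i915);

          /* execbuf(master, in_fence = split_fence) = master_fence */
          master->flags |= I915_EXEC_FENCE_IN | I915_EXEC_FENCE_OUT;
          master->rsvd2 = split_fence;
          gem_execbuf_wr(i915, master);
          master_fence = master->rsvd2 >> 32;

          /* execbuf(slave, submit_fence = master_fence) */
          slave->flags |= I915_EXEC_FENCE_SUBMIT;
          slave->rsvd2 = master_fence;
          gem_execbuf(i915, slave);

          /* sw_fence_advance(split_fence) */
          igt_cork_unplug(cork);

          close(split_fence);
          close(master_fence);
  }

The caller would declare the cork with IGT_CORK_FENCE(). The idea is 
that the cork keeps the master from becoming ready until the slave 
execbuf has attached its submit fence, so neither request should be 
able to reach the ELSP before the bonded-pair properties are applied 
to both.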

Sounds tempting to try, thoughts?

Regards,

Tvrtko

