[Intel-gfx] [PATCH v10] drm/i915: Extend LRC pinning to cover GPU context writeback

Tue Jan 19 09:31:49 PST 2016

On 19/01/16 17:18, Tvrtko Ursulin wrote:
>
> On 19/01/16 10:24, Tvrtko Ursulin wrote:
>>
>>
>> On 18/01/16 20:47, Chris Wilson wrote:
>>> On Mon, Jan 18, 2016 at 05:14:26PM +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 18/01/16 16:53, Chris Wilson wrote:
>>>>> On Mon, Jan 18, 2016 at 03:02:25PM +0000, Tvrtko Ursulin wrote:
>>>>>> -       while (!list_empty(&ring->request_list)) {
>>>>>> -               struct drm_i915_gem_request *request;
>>>>>> -
>>>>>> -               request = list_first_entry(&ring->request_list,
>>>>>> -                                          struct
>>>>>> drm_i915_gem_request,
>>>>>> -                                          list);
>>>>>> -
>>>>>> -               if (!i915_gem_request_completed(request, true))
>>>>>> +       list_for_each_entry_safe(req, next, &ring->request_list,
>>>>>> list) {
>>>>>> +               if (!i915_gem_request_completed(req, true))
>>>>>>                          break;
>>>>>>
>>>>>> -               i915_gem_request_retire(request);
>>>>>> +               if (!i915.enable_execlists ||
>>>>>> !i915.enable_guc_submission) {
>>>>>> +                       i915_gem_request_retire(req);
>>>>>> +               } else {
>>>>>> +                       prev_req = list_prev_entry(req, list);
>>>>>> +                       if (prev_req)
>>>>>> +                               i915_gem_request_retire(prev_req);
>>>>>> +               }
>>>>>>          }
>>>>>>
>>>>>> To explain, this attempts to ensure that in GuC mode requests are
>>>>>> only
>>>>>> unreferenced if there is a *following* *completed* request.
>>>>>>
>>>>>> This way, regardless of whether they are using the same or different
>>>>>> contexts, we can be sure that the GPU has either completed the
>>>>>> context writing, or that the unreference will not cause the final
>>>>>> unpin of the context.
>>>>>
>>>>> This is the first bogus step. contexts have to be unreferenced from
>>>>> request retire, not request free. As it stands today, this forces
>>>>> us to
>>>>> hold the struct_mutex for the free (causing many foul ups along the
>>>>> line).  The only reason why it is like that is because of execlists
>>>>> not
>>>>> decoupling its context pinning inside request cancel.
>>>>
>>>> What is the first bogus step? My idea of how to fix the GuC issue,
>>>> or the mention of final unreference in relation to GPU completing
>>>> the submission?
>>>
>>> That we want to want to actually unreference the request. We want to
>>> unpin the context at the appropriate juncture. At the moment, it looks
>>
>> What would be the appropriate juncture? With GuC we don't have the
>> equivalent of context complete irq.
>>
>>> like that you are conflating those two steps: "requests are only
>>> unreferenced". Using the retirement mechanism would mean coupling the
>>> context unpinning into a subsequent request rather than defer retiring a
>>> completed request, for example legacy uses active vma tracking to
>>> accomplish the same thing. Aiui, the current claim is that we couldn't
>>> do that since the guc may reorder contexts - except that we currently
>>> use a global seqno so that would be bad on many levels.
>>
>> I don't know legacy. :( I can see that request/context lifetime is
>> coupled there and associated with request creation to retirement.
>>
>> Does it have the same problem of seqno signaling completion before the
>> GPU is done with writing out the context image and how does it solve
>> that?
>
> Ok I think I am starting to see the legacy code paths.
>
> Interesting areas are i915_switch_context + do_switch which do the
> ring->last_context tracking and make the ring/engine own one extra
> reference on the context.
>
> Then, code paths which want to make sure no user context are active on
> the GPU call i915_gpu_idle and submit a dummy default context request.
>
> The latter even explicitly avoids execlist mode.
>
> So unless I am missing something, we could just unify the behaviour
> between the two. Make ring/engine->last_context do the identical
> tracking as legacy context switching and let i915_gpu_idle idle the GPU
> in execlist mode as well?

Although I am not sure the engine->last_context concept works with LRC 
and GuC because of the multiple submission ports. Need to give it more 
thought.

Regards,

Tvrtko