[Intel-gfx] [PATCH 11/15] drm/i915/execlists: Cancel banned contexts on schedule-out

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Oct 14 13:38:02 UTC 2019


On 14/10/2019 14:23, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-10-14 14:19:19)
>>
>> On 14/10/2019 14:13, Chris Wilson wrote:
>>> Quoting Chris Wilson (2019-10-14 13:34:35)
>>>> Quoting Tvrtko Ursulin (2019-10-14 13:25:58)
>>>>>
>>>>> On 14/10/2019 13:06, Chris Wilson wrote:
>>>>>> Quoting Tvrtko Ursulin (2019-10-14 13:00:01)
>>>>>>>
>>>>>>> On 14/10/2019 10:07, Chris Wilson wrote:
>>>>>>>> +static void cancel_active(struct i915_request *rq,
>>>>>>>> +                       struct intel_engine_cs *engine)
>>>>>>>> +{
>>>>>>>> +     struct intel_context * const ce = rq->hw_context;
>>>>>>>> +
>>>>>>>> +     /*
>>>>>>>> +      * The executing context has been cancelled. Fixup the context so that
>>>>>>>> +      * it will be marked as incomplete [-EIO] upon resubmission and not
>>>>>
>>>>> (read below first)
>>>>>
>>>>> ... and not misleadingly say "Fixup the context so that it will be
>>>>> marked as incomplete" because there is nothing in this function which
>>>>> does that. It mostly happens by the virtual of context already being
>>>>> marked as banned somewhere else. This comment should just explain the
>>>>> decision to rewind the ring->head for more determinism. It can still
>>>>> mention canceling of user payload and -EIO. Just needs to be clear of
>>>>> the single extra thing achieved here by the head rewind and context edit.
>>>>
>>>> I thought I was clear: "upon resubmission". So use a more active voice to
>>>> reduce ambiguity, gotcha.
>>>
>>>           /*
>>>            * The executing context has been cancelled. We want to prevent
>>>            * further execution along this context and propagate the error on
>>>            * to anything depending on its results.
>>>            *
>>>            * In __i915_request_submit(), we apply the -EIO and remove the
>>>            * requests payload for any banned requests. But first, we must
>>>            * rewind the context back to the start of the incomplete request so
>>>            * that we don't jump back into the middle of the batch.
>>>            *
>>>            * We preserve the breadcrumbs and semaphores of the incomplete
>>>            * requests so that inter-timeline dependencies (i.e other timelines)
>>>            * remain correctly ordered.
>>>            */
>>>
>>
>> Sounds good.
>>
>> Btw.. does this work? :) If context was preempted in the middle of a
>> batch buffer there must be some other state saved (other than RING_HEAD)
>> which on context restore enables it to continue from the right place
>> *within* the batch. Is this code zapping that state as well so GPU will
>> fully forget it was inside the batch?
> 
> Yes. We are resetting the context image back to vanilla, and then
> restore the ring registers to restart from this request. The selftests
> are using spinning batches to simulate the banned context.

Okay, in that case:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko




More information about the Intel-gfx mailing list