[Intel-gfx] [CI] drm/i915/execlists: Workaround switching back to a complete context
Mika Kuoppala
mika.kuoppala at linux.intel.com
Fri Mar 27 20:33:29 UTC 2020
Chris Wilson <chris at chris-wilson.co.uk> writes:
> In what seems remarkably similar to the w/a required to not reload an
> idle context with HEAD==TAIL, it appears we must prevent the HW from
> switching to an idle context in ELSP[1], while simultaneously trying to
> preempt the HW to run another context and a continuation of the idle
> context (which is no longer idle).
>
> We can achieve this by preventing the context from completing while we
> reload a new ELSP (by applying ring_set_paused(1) across the whole of
> dequeue), except this eventually fails due to a lite-restore into a
> waiting semaphore does not generate an ACK. Instead, we try to avoid
> making the GPU do anything too challenging and not submit a new ELSP
> while the interrupts + CSB events appear to have fallen behind the
> completed contexts. We expect it to catch up shortly so we queue another
> tasklet execution and hope for the best.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/issues/1501
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_lrc.c | 26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index b12355048501..5f17ece07858 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1915,11 +1915,26 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> * of trouble.
> */
> active = READ_ONCE(execlists->active);
> - while ((last = *active) && i915_request_completed(last))
> - active++;
>
> - if (last) {
> + /*
> + * In theory we can skip over completed contexts that have not
> + * yet been processed by events (as those events are in flight):
> + *
> + * while ((last = *active) && i915_request_completed(last))
> + * active++;
> + *
> + * However, the GPU is cannot handle this as it will ultimately
s/is//
I applaud the straightforward nature of this compared to the pausing.
Albeit this seems to have a cost.
But this should be quite rare event comparatively?
> + * find itself trying to jump back into a context it has just
> + * completed and barf.
> + */
> +
> + if ((last = *active)) {
> if (need_preempt(engine, last, rb)) {
> + if (i915_request_completed(last)) {
> + tasklet_hi_schedule(&execlists->tasklet);
> + return;
> + }
> +
I was pondering of the lost tracing and if you can
work it backwards to this condition.
But I really hope this nails it,
Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> ENGINE_TRACE(engine,
> "preempting last=%llx:%lld, prio=%d, hint=%d\n",
> last->fence.context,
> @@ -1947,6 +1962,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> last = NULL;
> } else if (need_timeslice(engine, last) &&
> timer_expired(&engine->execlists.timer)) {
> + if (i915_request_completed(last)) {
> + tasklet_hi_schedule(&execlists->tasklet);
> + return;
> + }
> +
> ENGINE_TRACE(engine,
> "expired last=%llx:%lld, prio=%d, hint=%d\n",
> last->fence.context,
> --
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
More information about the Intel-gfx
mailing list