[Intel-gfx] [PATCH v6] drm/i915/guc: Add support for reset engine using GuC commands
Chris Wilson
chris at chris-wilson.co.uk
Wed Nov 1 13:58:04 UTC 2017
Quoting Michel Thierry (2017-10-31 22:53:09)
> This patch adds per engine reset and recovery (TDR) support when GuC is
> used to submit workloads to GPU.
>
> In the case of i915 directly submission to ELSP, driver manages hang
> detection, recovery and resubmission. With GuC submission these tasks
> are shared between driver and GuC. i915 is still responsible for detecting
> a hang, and when it does it only requests GuC to reset that Engine. GuC
> internally manages acquiring forcewake and idling the engine before
> resetting it.
>
> Once the reset is successful, i915 takes over again and handles the
> resubmission. The scheduler in i915 knows which requests are pending so
> after resetting a engine, pending workloads/requests are resubmitted
> again.
>
> v2: s/i915_guc_request_engine_reset/i915_guc_reset_engine/ to match the
> non-guc function names.
>
> v3: Removed debug message about engine restarting from which request,
> since the new baseline do it regardless of submission mode. (Chris)
>
> v4: Rebase.
>
> v5: Do not pass unnecessary reporting flags to the fw (Jeff);
> tasklet_schedule(&execlists->irq_tasklet) handles the resubmit; rebase.
>
> v6: Rename the existing reset engine function and share a similar
> interface between guc and non-guc paths (Chris).
>
> Signed-off-by: Michel Thierry <michel.thierry at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_drv.c | 15 +++++++++++++--
> drivers/gpu/drm/i915/i915_drv.h | 2 ++
> drivers/gpu/drm/i915/intel_guc.c | 24 ++++++++++++++++++++++++
> drivers/gpu/drm/i915/intel_guc_fwif.h | 1 +
> drivers/gpu/drm/i915/intel_uncore.c | 5 -----
> 5 files changed, 40 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index af745749509c..359333a423cf 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1950,6 +1950,12 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> goto finish;
> }
>
> +static inline int intel_gt_reset_engine(struct drm_i915_private *dev_priv,
> + struct intel_engine_cs *engine)
> +{
> + return intel_gpu_reset(dev_priv, intel_engine_flag(engine));
> +}
> +
> /**
> * i915_reset_engine - reset GPU engine to recover from a hang
> * @engine: engine to reset
> @@ -1984,10 +1990,15 @@ int i915_reset_engine(struct intel_engine_cs *engine, unsigned int flags)
> goto out;
> }
>
> - ret = intel_gpu_reset(engine->i915, intel_engine_flag(engine));
> + if (!engine->i915->guc.execbuf_client)
> + ret = intel_gt_reset_engine(engine->i915, engine);
> + else
> + ret = intel_guc_reset_engine(&engine->i915->guc, engine);
> +
> if (ret) {
> /* If we fail here, we expect to fallback to a global reset */
> - DRM_DEBUG_DRIVER("Failed to reset %s, ret=%d\n",
> + DRM_DEBUG_DRIVER("%sFailed to reset %s, ret=%d\n",
> + (engine->i915->guc.execbuf_client ? "GUC ":""),
A bit overkill on the parentheses there ;)
Lgtm, can you please ping, say, Jeff or Daniele for an r-b on the guc
interaction?
-Chris
More information about the Intel-gfx
mailing list