[Intel-gfx] [PATCH 2/4] drm/i915: Cancel outstanding work after disabling heartbeats on an engine
Joonas Lahtinen
joonas.lahtinen at linux.intel.com
Fri Sep 25 11:04:09 UTC 2020
Quoting Chris Wilson (2020-09-16 12:42:17)
> We only allow persistent requests to remain on the GPU past the closure
> of their containing context (and process) so long as they are continuously
> checked for hangs or allow other requests to preempt them, as we need to
> ensure forward progress of the system. If we allow persistent contexts
> to remain on the system after the the hangcheck mechanism is disabled,
> the system may grind to a halt. On disabling the mechanism, we sent a
> pulse along the engine to remove all executing contexts from the engine
> which would check for hung contexts -- but we did not prevent those
> contexts from being resubmitted if they survived the final hangcheck.
>
> Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
> Testcase: igt/gem_ctx_persistence/heartbeat-stop
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: <stable at vger.kernel.org> # v5.7+
Definitely makes sense to ensure.
Acked-by: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
Regards, Joonas
> ---
> drivers/gpu/drm/i915/gt/intel_engine.h | 9 +++++++++
> drivers/gpu/drm/i915/i915_request.c | 5 +++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 08e2c000dcc3..7c3a1012e702 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -337,4 +337,13 @@ intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
> return intel_engine_has_preemption(engine);
> }
>
> +static inline bool
> +intel_engine_has_heartbeat(const struct intel_engine_cs *engine)
> +{
> + if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
> + return false;
> +
> + return READ_ONCE(engine->props.heartbeat_interval_ms);
> +}
> +
> #endif /* _INTEL_RINGBUFFER_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 436ce368ddaa..0e813819b041 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -542,8 +542,13 @@ bool __i915_request_submit(struct i915_request *request)
> if (i915_request_completed(request))
> goto xfer;
>
> + if (unlikely(intel_context_is_closed(request->context) &&
> + !intel_engine_has_heartbeat(engine)))
> + intel_context_set_banned(request->context);
> +
> if (unlikely(intel_context_is_banned(request->context)))
> i915_request_set_error_once(request, -EIO);
> +
> if (unlikely(fatal_error(request->fence.error)))
> __i915_request_skip(request);
>
> --
> 2.20.1
>
More information about the Intel-gfx
mailing list