[Intel-gfx] [PATCH 19/27] drm/i915: Replace hangcheck by heartbeats

Joonas Lahtinen joonas.lahtinen at linux.intel.com
Fri Sep 27 08:26:52 UTC 2019


Quoting Chris Wilson (2019-09-25 13:01:29)
> Replace sampling the engine state every so often with a periodic
> heartbeat request to measure the health of an engine. This is coupled
> with the forced-preemption to allow long running requests to survive so
> long as they do not block other users.
> 
> v2: Couple in sysfs controls
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
> Reviewed-by: Jon Bloomfield <jon.bloomfield at intel.com>
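
For readers skimming the thread, the scheme described in the commit message can be pictured with a small userspace toy model. This is only a sketch under assumptions: the toy_* names are hypothetical, only the "systole" name is borrowed from the patch, and the driver's actual response to a missed heartbeat is not shown in this excerpt. Here a pulse that is still incomplete when the next tick fires simply marks the engine as hung.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request submitted to an engine. */
struct toy_request {
	bool completed;
};

/* Hypothetical stand-in for the per-engine heartbeat state. */
struct toy_engine {
	struct toy_request *systole;	/* last heartbeat pulse submitted, if any */
	bool hung;
};

/*
 * One heartbeat tick: if the previous pulse has not completed, treat the
 * engine as hung (assumption: the real driver may escalate before this).
 * Otherwise submit 'next' as the new pulse.
 */
static void toy_heartbeat_tick(struct toy_engine *e, struct toy_request *next)
{
	assert(e && next);

	if (e->systole && !e->systole->completed) {
		e->hung = true;
		return;
	}

	next->completed = false;
	e->systole = next;
}
```

The point of the model is that health is measured by whether a request makes it through the engine within one interval, rather than by sampling engine state.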

<SNIP>

> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -37,3 +37,14 @@ config DRM_I915_PREEMPT_TIMEOUT
>           to execute.
>  
>           May be 0 to disable the timeout.
> +
> +config DRM_I915_HEARTBEAT_INTERVAL
> +       int "Interval between heartbeat pulses (ms)"
> +       default 2500 # microseconds

The prompt says "ms" but the comment says microseconds. Pick one?

> +       help
> +         While active the driver uses a periodic request, a heartbeat, to
> +         check the wellness of the GPU and to regularly flush state changes
> +         (idle barriers).
> +
> +         May be 0 to disable heartbeats and therefore disable automatic GPU
> +         hang detection.

Worth mentioning that this can be overridden from sysfs.
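
As a rough illustration of what the userspace side of such an override could look like, here is a minimal sketch that reads an interval back from a sysfs-style attribute. The attribute path is left to the caller because this excerpt does not show the name of the sysfs file, and both helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the decimal text an attribute would return, e.g. "2500\n". */
static int parse_interval_ms(const char *text, long *out)
{
	char *end;
	long val;

	assert(text && out);

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text || val < 0)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;	/* trailing garbage */

	*out = val;
	return 0;
}

/* Read and parse the attribute at 'path'; returns 0 on success, -1 on error. */
static int read_interval_ms(const char *path, long *out)
{
	char buf[32];
	FILE *f;
	int ret = -1;

	assert(path && out);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_interval_ms(buf, out);
	fclose(f);

	return ret;
}
```

Per the help text above, a value of 0 read back this way would mean heartbeats, and with them automatic hang detection, are disabled.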

> +static void heartbeat(struct work_struct *wrk)
> +{

<SNIP>

> +       if (i915_modparams.enable_hangcheck)
> +               engine->heartbeat.systole = i915_request_get(rq);

I'd be more inclined towards a userspace opt-in for running indefinitely,
and getting rid of the modparam completely.

Long-running workloads might not even preempt at the desired granularity.

Regards, Joonas

