[Intel-gfx] [PATCH] drm/i915/guc: Log engine resets

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Dec 17 12:15:53 UTC 2021


On 14/12/2021 15:07, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> Log engine resets done by the GuC firmware in a similar way to how it is
> done by the execlists backend.
> 
> This way we have a notion of where the hangs are before the GuC gains
> support for proper error capture.

Ping - any interest in logging this info?

All there is currently is a non-descriptive
"[drm] GPU HANG: ecode 12:0:00000000".

Also, will GuC be reporting the reason for the engine reset at any point?

Regards,

Tvrtko

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: John Harrison <John.C.Harrison at Intel.com>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 97311119da6f..51512123dc1a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -11,6 +11,7 @@
>   #include "gt/intel_context.h"
>   #include "gt/intel_engine_pm.h"
>   #include "gt/intel_engine_heartbeat.h"
> +#include "gt/intel_engine_user.h"
>   #include "gt/intel_gpu_commands.h"
>   #include "gt/intel_gt.h"
>   #include "gt/intel_gt_clock_utils.h"
> @@ -3934,9 +3935,18 @@ static void capture_error_state(struct intel_guc *guc,
>   {
>   	struct intel_gt *gt = guc_to_gt(guc);
>   	struct drm_i915_private *i915 = gt->i915;
> -	struct intel_engine_cs *engine = __context_to_physical_engine(ce);
> +	struct intel_engine_cs *engine = ce->engine;
>   	intel_wakeref_t wakeref;
>   
> +	if (intel_engine_is_virtual(engine)) {
> +		drm_notice(&i915->drm, "%s class, engines 0x%x; GuC engine reset\n",
> +			   intel_engine_class_repr(engine->class),
> +			   engine->mask);
> +		engine = guc_virtual_get_sibling(engine, 0);
> +	} else {
> +		drm_notice(&i915->drm, "%s GuC engine reset\n", engine->name);
> +	}
> +
>   	intel_engine_set_hung_context(engine, ce);
>   	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
>   		i915_capture_error_state(gt, engine->mask);
> 
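
For illustration only, here is a minimal user-space sketch of the two message
formats the patch adds, so the resulting log lines can be seen without
building a kernel. It is not driver code: struct mock_engine,
format_reset_notice() and the class_name field are hypothetical stand-ins
(for struct intel_engine_cs, the drm_notice() calls and
intel_engine_class_repr() respectively), and any engine names or masks passed
in are made-up examples.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the fields of struct intel_engine_cs that the
 * patch reads. This is an assumption for illustration, not the real layout.
 */
struct mock_engine {
	const char *name;       /* physical engine name, e.g. "rcs0" */
	const char *class_name; /* stand-in for intel_engine_class_repr() */
	unsigned int mask;      /* bitmask of physical engines */
	bool is_virtual;        /* stand-in for intel_engine_is_virtual() */
};

/*
 * Build the same text the patch passes to drm_notice(), minus the trailing
 * newline. Returns the snprintf() result.
 */
static int format_reset_notice(char *buf, size_t len,
			       const struct mock_engine *e)
{
	/* Virtual engine: no single name, so report class and engine mask. */
	if (e->is_virtual)
		return snprintf(buf, len,
				"%s class, engines 0x%x; GuC engine reset",
				e->class_name, e->mask);

	/* Physical engine: the engine name identifies it directly. */
	return snprintf(buf, len, "%s GuC engine reset", e->name);
}
```

Calling format_reset_notice() on a non-virtual mock engine named "rcs0"
yields "rcs0 GuC engine reset"; a virtual one yields the class string
followed by its sibling mask in hex, which is the extra context the bare
"GPU HANG" line lacks.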

