[Intel-gfx] [PATCH v2] drm/i915/gt: Convert reset prepare failure log to trace
John Harrison
john.c.harrison at intel.com
Tue Dec 5 09:10:57 UTC 2023
On 12/5/2023 00:52, Nirmoy Das wrote:
> gen8_engine_reset_prepare() can fail when the HW fails to set the
> RESET_CTL_READY_TO_RESET bit. In some cases this is not a fatal
> error, as the driver will retry.
>
> Convert the log to a trace log for debugging without triggering
> unnecessary concerns in CI or for end-users during non-fatal scenarios.
I strongly disagree with this change. The hardware spec for the
RESET_CTL and GDRST registers is that they will self-clear within a
matter of microseconds. If something is so badly wrong with the hardware
that it can't even manage to reset then that is something that very much
warrants more than a completely silent trace event. It most certainly
should be flagged as a failure in CI.
Just because the driver will retry does not mean that this is not a
serious error. And if the first attempt failed, why would a subsequent
attempt succeed? Escalating to FLR may have more success, but that is
not something that i915 currently does.
John.
>
> v2: Improve commit message (Tvrtko)
>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: John Harrison <John.C.Harrison at Intel.com>
> Cc: Andi Shyti <andi.shyti at linux.intel.com>
> Cc: Andrzej Hajda <andrzej.hajda at intel.com>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> Reviewed-by: Andi Shyti <andi.shyti at linux.intel.com>
> Reviewed-by: Andrzej Hajda <andrzej.hajda at intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index d5ed904f355d..e6fbc6202c80 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct intel_engine_cs *engine)
>  	ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
>  					   700, 0, NULL);
>  	if (ret)
> -		gt_err(engine->gt,
> -		       "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
> -		       engine->name, request,
> -		       intel_uncore_read_fw(uncore, reg));
> +		GT_TRACE(engine->gt,
> +			 "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
> +			 engine->name, request,
> +			 intel_uncore_read_fw(uncore, reg));
>
>  	return ret;
>  }
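
For readers unfamiliar with the pattern under discussion, the following is a
minimal user-space sketch of "request a reset, poll for the ready bit, give up
after a budget". It is not i915 code: every name in it (fake_engine,
fake_read, fake_wait_for_register, fake_reset_prepare, the FAKE_* bits) is
hypothetical, the register is simulated in memory, and the timeout is counted
in polls rather than microseconds. The failure path logs to stderr, which
roughly corresponds to the error-level message the patch proposes to demote.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit layout, chosen for illustration only. */
#define FAKE_REQUEST_RESET  (1u << 0)
#define FAKE_READY_TO_RESET (1u << 1)

/* Simulated engine: one register plus a model of how soon the HW responds. */
struct fake_engine {
	uint32_t reset_ctl;    /* simulated RESET_CTL-like register */
	int polls_until_ready; /* negative: the hardware never acknowledges */
};

/* Simulated register read; sets the ready bit once the countdown reaches 0. */
static uint32_t fake_read(struct fake_engine *e)
{
	if (e->polls_until_ready == 0)
		e->reset_ctl |= FAKE_READY_TO_RESET;
	else if (e->polls_until_ready > 0)
		e->polls_until_ready--;

	return e->reset_ctl;
}

/* Poll until (reg & mask) == value or the poll budget runs out. */
static int fake_wait_for_register(struct fake_engine *e, uint32_t mask,
				  uint32_t value, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if ((fake_read(e) & mask) == value)
			return 0;
	}

	return -1; /* timed out */
}

/* Request a reset and wait for the acknowledgement; 0 on success. */
static int fake_reset_prepare(struct fake_engine *e, int max_polls)
{
	int ret;

	e->reset_ctl = FAKE_REQUEST_RESET;
	ret = fake_wait_for_register(e, FAKE_READY_TO_RESET,
				     FAKE_READY_TO_RESET, max_polls);
	if (ret)
		/* The thread is about whether this should be loud or silent. */
		fprintf(stderr, "reset request timed out: RESET_CTL: %08x\n",
			(unsigned int)e->reset_ctl);

	return ret;
}
```

In this model an engine that never acknowledges (polls_until_ready < 0) fails
identically on every retry, which is the situation the objection above
describes: a retry alone does not change the outcome, so a silent failure
would hide a persistent hardware problem.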