[Intel-gfx] [PATCH 13/15] drm/i915: Emit a user level message when resetting the GPU (or engine)
Chris Wilson
chris at chris-wilson.co.uk
Thu Jul 20 12:52:39 UTC 2017
Quoting Michel Thierry (2017-07-18 01:22:28)
> On 17/07/17 02:11, Chris Wilson wrote:
> > Although a banned context will be told to -EIO off if they try to submit
> > more requests, we have a discrepancy between whole device resets and
> > per-engine resets where we report the GPU reset but not the engine
> > resets. This leaves a bit of mystery as to why the context was banned,
> > and also reduces awareness overall of when a GPU (engine) reset occurs
> > with its possible side-effects.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Michel Thierry <michel.thierry at intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_drv.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index bc121a46ed9a..4b62fd012877 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1865,9 +1865,10 @@ void i915_reset(struct drm_i915_private *dev_priv)
> > if (!i915_gem_unset_wedged(dev_priv))
> > goto wakeup;
> >
> > + dev_notice(dev_priv->drm.dev,
> > + "Resetting chip after gpu hang\n");
> > error->reset_count++;
> >
> > - pr_notice("drm/i915: Resetting chip after gpu hang\n");
> > disable_irq(dev_priv->drm.irq);
> > ret = i915_gem_reset_prepare(dev_priv);
> > if (ret) {
> > @@ -1945,7 +1946,9 @@ int i915_reset_engine(struct intel_engine_cs *engine)
> >
> > GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
> >
> > - DRM_DEBUG_DRIVER("resetting %s\n", engine->name);
> > + dev_notice(engine->i915->drm.dev,
> > + "Resetting %s after gpu hang\n", engine->name);
> > + error->reset_engine_count[engine->id]++;
> >
>
> This will increment both the engine-reset-count and gpu-reset count in
> the unlikely case that engine-reset gets promoted to full reset.
>
> Not a problem per se, but I wanted to point it out (plus it makes both
> functions symmetric).
I felt it was justified, as we then increment the relevant counter on
every attempt rather than only on success, which was already the
behaviour for the global counter. I guess I should split that out since
it is unrelated.
-Chris
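
The counting behaviour discussed above can be illustrated with a small
standalone model. This is only a sketch and not the i915 code: the
model_* names, MODEL_NUM_ENGINES and the engine_reset_succeeds flag are
hypothetical, and the promotion to a full reset is folded into the
engine path purely for brevity. It assumes, as in the thread, that each
counter is bumped on every attempt rather than only on success.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's error-tracking state. */
enum { MODEL_NUM_ENGINES = 4 };

struct model_gpu_error {
	unsigned int reset_count;                           /* full-chip reset attempts */
	unsigned int reset_engine_count[MODEL_NUM_ENGINES]; /* per-engine reset attempts */
};

/* Count the full-chip attempt up front; the reset itself is not modelled. */
static void model_reset(struct model_gpu_error *error)
{
	error->reset_count++;
}

/*
 * Count the engine attempt before trying it. If the engine reset fails,
 * it is promoted to a full reset, so both counters end up incremented
 * (assumption: promotion is modelled inline here for simplicity).
 * Returns true when the engine reset alone was sufficient.
 */
static bool model_reset_engine(struct model_gpu_error *error,
			       unsigned int id,
			       bool engine_reset_succeeds)
{
	assert(id < MODEL_NUM_ENGINES);

	error->reset_engine_count[id]++;
	if (engine_reset_succeeds)
		return true;

	model_reset(error);
	return false;
}
```

In this model a successful engine reset touches only that engine's
counter, while a failed one that gets promoted leaves one extra count in
both the per-engine and the global counter, which is the double
increment pointed out in the review.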