[Intel-gfx] [PATCH 10/14] drm/i915: Start chopping up the GPU error capture

Chris Wilson chris at chris-wilson.co.uk
Thu Jan 9 15:40:12 UTC 2020


Quoting Andi Shyti (2020-01-09 15:31:15)
> Hi Chris,
> 
> On Thu, Jan 09, 2020 at 08:58:35AM +0000, Chris Wilson wrote:
> > In the near future, we will want to start a GPU error capture from a new
> > context, from inside the softirq region of a forced preemption. To do
> > so requires us to break up the monolithic error capture to provide new
> > entry points with finer control; in particular focusing on one
> > engine/gt, and being able to compose an error state from little pieces
> > of HW capture.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Andi Shyti <andi.shyti at intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine.h       |    2 +-
> >  drivers/gpu/drm/i915/gt/intel_engine_cs.c    |    6 +-
> >  drivers/gpu/drm/i915/gt/intel_ggtt.c         |    3 +
> >  drivers/gpu/drm/i915/gt/intel_gtt.h          |    1 +
> >  drivers/gpu/drm/i915/gt/intel_reset.c        |    2 +-
> >  drivers/gpu/drm/i915/gt/selftest_hangcheck.c |    2 +-
> >  drivers/gpu/drm/i915/i915_debugfs.c          |   14 +-
> >  drivers/gpu/drm/i915/i915_drv.h              |    2 +-
> >  drivers/gpu/drm/i915/i915_gpu_error.c        | 1169 ++++++++++--------
> >  drivers/gpu/drm/i915/i915_gpu_error.h        |  328 +++--
> >  drivers/gpu/drm/i915/i915_sysfs.c            |    6 +-
> 
> don't we want to have a gt/intel_gt_error.[ch] at some point?

I did give it some thought, and at the moment i915_gpu_error.c exists in
its own little bubble on the outside of the driver. That isn't to say
we couldn't keep gt/error_(engine|gt).c in the same bubble, but it was
easier to keep it where it was and hack it to provide an engine capture
interface.

I think it is the direction we want to go in, but I think the first step
is to make the output file structured (yaml is my pick) so that we can
rearrange, extend, remove bits and bobs without upsetting consumers. I
was very close to making the transition to yaml, but decided to bluster
through anyway since we dare not release a kernel where error capture is
disabled-by-default. They have their pitchforks at the ready.
-Chris

