[Intel-gfx] [PATCH v4] drm/i915: Show HWSP in intel_engine_dump()
Chris Wilson
chris at chris-wilson.co.uk
Fri Dec 22 19:05:47 UTC 2017
Quoting Chris Wilson (2017-12-22 19:00:24)
> Quoting Tvrtko Ursulin (2017-12-22 18:52:29)
> >
> > On 22/12/2017 18:25, Chris Wilson wrote:
> > > Looking at a CI failure with an ominous line of
> > > [ 362.550715] hangcheck current seqno ffffff6b, last ffffff8c, hangcheck ffffff6b [6016 ms], inflight 118
> > > with no apparent cause for the seqno to be negative, left me wondering
> > > if someone had scribbled over the HWSP. So include the HWSP in the
> > > engine dump to see if there are more signs of random scribbling.
> > >
> > > v2: Fix row pointer, i is now incremented by 8 so doesn't need scaling
> > > by 8, and we don't need to keep volatile here as the status_page isn't
> > > marked up as volatile itself.
> > > v3: Use hexdump, with suppression of identical lines. (Tvrtko)
> > > Which results in
> > >
> > > HWSP:
> > > 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > *
> > > 00000040 00000001 00000000 00000018 00000002 00000001 00000000 00000018 00000000
> > > 00000060 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000003
> > > 00000080 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > *
> > > 000000c0 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > 000000e0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > *
> > >
> > > instead of 128 lines of mostly 0s.
> > > v4: Tidy up the locals
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com> #v2
> > > ---
> > > drivers/gpu/drm/i915/intel_engine_cs.c | 31 ++++++++++++++++++++++++++++++-
> > > 1 file changed, 30 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > > index 9856e24c7c43..5eefd420f709 100644
> > > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > > @@ -1667,6 +1667,32 @@ static void print_request(struct drm_printer *m,
> > > rq->timeline->common->name);
> > > }
> > >
> > > +static void hexdump(struct drm_printer *m, const void *buf, size_t len)
> > > +{
> > > +	const size_t rowsize = 8 * sizeof(u32);
> > > +	const void *prev = NULL;
> > > +	bool skip = false;
> > > +	size_t row;
> >
> > offset or pos ? Row makes me think it is row-indexed but it is bytes.
>
> pos then. It's the start of the row, but I felt uncomfortable with
> it being an offset.
>
> > > +
> > > +	for (row = 0; row < len; row += rowsize) {
> > > +		char line[128];
> >
> > Could dial down the size wrt stack use. 8 * 8 = 64 + 7 spaces + null =
> > 72 should be enough.
>
> But 72 is not a power-of-two!
>
> > > +
> > > +		if (prev && !memcmp(prev, buf + row, rowsize)) {
> > > +			if (!skip) {
> > > +				drm_printf(m, "*\n");
> > > +				skip = true;
> > > +			}
> > > +			continue;
> > > +		}
> > > +
> > > +		hex_dump_to_buffer(buf + row, len - row, rowsize, sizeof(u32),
> > > +				   line, sizeof(line), false);
> >
> > Future proof by a WARN_ON_ONCE(ret >= sizeof(line)) ?
>
> Actually had a WARN_ON for testing, seems reasonable to stick one back
> in.
>
> > > +		drm_printf(m, "%08zx %s\n", row, line);
> > > +		prev = buf + row;
> > > +		skip = false;
> > > +	}
> > > +}
> > > +
> > > void intel_engine_dump(struct intel_engine_cs *engine,
> > > struct drm_printer *m,
> > > const char *header, ...)
> > > @@ -1869,8 +1895,11 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> > > &engine->irq_posted)),
> > > yesno(test_bit(ENGINE_IRQ_EXECLIST,
> > > &engine->irq_posted)));
> > > +
> > > + drm_printf(m, "HWSP:\n");
> > > + hexdump(m, engine->status_page.page_addr, PAGE_SIZE);
> > > +
> > > drm_printf(m, "Idle? %s\n", yesno(intel_engine_is_idle(engine)));
> > > - drm_printf(m, "\n");
> > > }
> > >
> > > static u8 user_class_map[] = {
> > >
> >
> > Looks okay - with or without the tweaks:
> >
> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> Ta, enjoy your break.
Applied the improvements and pushed it along with the other couple of
debugging patches. Now to wait and see if the errors recur in CI.
-Chris
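
For anyone who wants to try the row suppression outside the kernel, here is
a minimal userspace sketch of the same idea. It is not the code that was
pushed: the name hexdump_rows(), the FILE * output, the snprintf() formatting
(standing in for hex_dump_to_buffer()) and the assert() (standing in for the
suggested WARN_ON_ONCE) are all assumptions made for illustration. It also
takes the review suggestions on board (pos instead of row, a 72-byte line
buffer), and for simplicity only handles whole rows, which is fine for a
page-sized buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WORDS_PER_ROW 8
#define ROW_SIZE (WORDS_PER_ROW * sizeof(uint32_t))

/*
 * Hypothetical userspace analogue of the hexdump() helper in the patch.
 * Prints one line per 32-byte row, replacing each run of rows identical
 * to the last printed row with a single "*". Returns the number of
 * lines written. Trailing partial rows are ignored in this sketch.
 */
static int hexdump_rows(FILE *out, const void *buf, size_t len)
{
	const unsigned char *bytes = buf;
	const unsigned char *prev = NULL;
	int skip = 0;
	int lines = 0;
	size_t pos;

	for (pos = 0; pos + ROW_SIZE <= len; pos += ROW_SIZE) {
		/* 8 words * 8 hex digits + 7 spaces + NUL = 72 bytes */
		char line[WORDS_PER_ROW * 9];
		size_t used = 0;
		int i;

		if (prev && !memcmp(prev, bytes + pos, ROW_SIZE)) {
			if (!skip) {
				fputs("*\n", out);
				lines++;
				skip = 1;
			}
			continue;
		}

		for (i = 0; i < WORDS_PER_ROW; i++) {
			uint32_t word;
			int n;

			/* memcpy avoids any alignment assumptions on buf */
			memcpy(&word, bytes + pos + i * sizeof(word), sizeof(word));
			n = snprintf(line + used, sizeof(line) - used, "%s%08x",
				     i ? " " : "", (unsigned int)word);
			/* stand-in for the WARN_ON_ONCE truncation check */
			assert(n >= 0 && (size_t)n < sizeof(line) - used);
			used += (size_t)n;
		}

		fprintf(out, "%08zx %s\n", pos, line);
		lines++;
		prev = bytes + pos;
		skip = 0;
	}

	return lines;
}
```

Calling hexdump_rows(stdout, page, sizeof(page)) on a mostly-zero buffer
should give output in the same shape as the HWSP dump quoted above: the
first row, a "*" for the identical rows that follow, then each row that
differs from the last one printed.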