[Intel-gfx] [PATCH v4] drm/i915: Show HWSP in intel_engine_dump()

Chris Wilson chris at chris-wilson.co.uk
Fri Dec 22 19:05:47 UTC 2017


Quoting Chris Wilson (2017-12-22 19:00:24)
> Quoting Tvrtko Ursulin (2017-12-22 18:52:29)
> > 
> > On 22/12/2017 18:25, Chris Wilson wrote:
> > > Looking at a CI failure with an ominous line of
> > > [  362.550715] hangcheck current seqno ffffff6b, last ffffff8c, hangcheck ffffff6b [6016 ms], inflight 118
> > > with no apparent cause for the seqno to be negative, left me wondering
> > > if someone had scribbled over the HWSP. So include the HWSP in the
> > > engine dump to see if there are more signs of random scribbling.
> > > 
> > > v2: Fix row pointer, i is now incremented by 8 so doesn't need scaling
> > > by 8, and we don't need to keep volatile here as the status_page isn't
> > > marked up as volatile itself.
> > > v3: Use hexdump, with suppression of identical lines. (Tvrtko)
> > >      Which results in
> > > 
> > > HWSP:
> > > 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > *
> > > 00000040 00000001 00000000 00000018 00000002 00000001 00000000 00000018 00000000
> > > 00000060 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000003
> > > 00000080 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > *
> > > 000000c0 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > 000000e0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > *
> > > 
> > >      instead of 128 lines of mostly 0s.
> > > v4: Tidy up the locals
> > > 
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com> #v2
> > > ---
> > >   drivers/gpu/drm/i915/intel_engine_cs.c | 31 ++++++++++++++++++++++++++++++-
> > >   1 file changed, 30 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > > index 9856e24c7c43..5eefd420f709 100644
> > > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > > @@ -1667,6 +1667,32 @@ static void print_request(struct drm_printer *m,
> > >                  rq->timeline->common->name);
> > >   }
> > >   
> > > +static void hexdump(struct drm_printer *m, const void *buf, size_t len)
> > > +{
> > > +     const size_t rowsize = 8 * sizeof(u32);
> > > +     const void *prev = NULL;
> > > +     bool skip = false;
> > > +     size_t row;
> > 
> > offset or pos ? Row makes me think it is row-indexed but it is bytes.
> 
> pos then. It's the start of the row, but it I felt uncomfortable with
> it being an offset.
>  
> > > +
> > > +     for (row = 0; row < len; row += rowsize) {
> > > +             char line[128];
> > 
> > Could dial down the size wrt stack use. 8 * 8 = 64 + 7 spaces + null = 
> > 72 should be enough.
> 
> But 72 is not a power-of-two!
> 
> > > +
> > > +             if (prev && !memcmp(prev, buf + row, rowsize)) {
> > > +                     if (!skip) {
> > > +                             drm_printf(m, "*\n");
> > > +                             skip = true;
> > > +                     }
> > > +                     continue;
> > > +             }
> > > +
> > > +             hex_dump_to_buffer(buf + row, len - row, rowsize, sizeof(u32),
> > > +                                line, sizeof(line), false);
> > 
> > Future proof by a WARN_ON_ONCE(ret >= sizeof(line)) ?
> 
> Actually had a WARN_ON for testing, seems reasonable to stick one back
> in.
> 
> > > +             drm_printf(m, "%08zx %s\n", row, line);
> > > +             prev = buf + row;
> > > +             skip = false;
> > > +     }
> > > +}
> > > +
> > >   void intel_engine_dump(struct intel_engine_cs *engine,
> > >                      struct drm_printer *m,
> > >                      const char *header, ...)
> > > @@ -1869,8 +1895,11 @@ void intel_engine_dump(struct intel_engine_cs *engine,
> > >                                 &engine->irq_posted)),
> > >                  yesno(test_bit(ENGINE_IRQ_EXECLIST,
> > >                                 &engine->irq_posted)));
> > > +
> > > +     drm_printf(m, "HWSP:\n");
> > > +     hexdump(m, engine->status_page.page_addr, PAGE_SIZE);
> > > +
> > >       drm_printf(m, "Idle? %s\n", yesno(intel_engine_is_idle(engine)));
> > > -     drm_printf(m, "\n");
> > >   }
> > >   
> > >   static u8 user_class_map[] = {
> > > 
> > 
> > Looks okay - with or without the tweaks:
> > 
> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> Ta, enjoy your break.

Applied the improvements and pushed it along with the other couple of
debugging patches. Now to play wait and see if the errors reoccur in CI.
-Chris


More information about the Intel-gfx mailing list