[Intel-gfx] [PATCH igt] lib/gt: Insert an arbitration point in our hang batch

Ville Syrjälä ville.syrjala at linux.intel.com
Fri Oct 27 13:24:16 UTC 2017


On Fri, Oct 27, 2017 at 01:59:17PM +0100, Chris Wilson wrote:
> Quoting Ville Syrjälä (2017-10-27 13:54:49)
> > On Fri, Oct 27, 2017 at 01:45:35PM +0100, Chris Wilson wrote:
> > > A purely recursive batch has the downside that it is a severe drain on
> > > system resources (see commit f978cc027cd0 "lib/dummyload: Pad with a few
> > > nops so that we do not completely hog the system") which can result in
> > > the test being starved and failing to make reasonable progress. For more
> > > reliable resets, also include an arbitration point. This should lessen
> > > the efficacy of the hang...
> > > 
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > > ---
> > >  lib/igt_gt.c | 10 +++++-----
> > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/lib/igt_gt.c b/lib/igt_gt.c
> > > index 89727d22..2aebad14 100644
> > > --- a/lib/igt_gt.c
> > > +++ b/lib/igt_gt.c
> > > @@ -313,12 +313,12 @@ igt_hang_t igt_hang_ctx(int fd,
> > >       len = 2;
> > >       if (intel_gen(intel_get_drm_devid(fd)) >= 8)
> > >               len++;
> > > -     b[0] = MI_BATCH_BUFFER_START | (len - 2);
> > > -     b[len] = MI_BATCH_BUFFER_END;
> > > -     b[len+1] = MI_NOOP;
> > > -     gem_write(fd, exec.handle, 0, b, sizeof(b));
> > > +     b[0] = 0x5 << 23; /* ARB_CHK */
> > 
> > That seems to be gen4+ only. Also
> > "This instruction can be placed only in a ring buffer, never in a batch
> > buffer."
> 
> Any idea since/until when? It's definitely listed for batches on gen8+.

If I'm reading things correctly then IVB+ can have it in an RCS batch.
But for the other engines it seems that the "ring only" restriction
still applies, until some *future* gen.
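
To make that reading concrete, here is a minimal sketch of the check it
implies. The helper name and its parameters are hypothetical (nothing like
this exists in IGT as far as this thread shows), and it only encodes the
interpretation above: IVB is gen7, and only the render engine is assumed
to accept the command inside a batch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not an IGT function: reflects the reading that
 * ARB_CHECK is ring-only before IVB (gen7) and, from IVB on, is only
 * allowed in batches on the render engine (RCS).
 */
static bool arb_check_ok_in_batch(int gen, bool is_render_engine)
{
	/* Before gen7 the "ring buffer only" restriction applies everywhere. */
	if (gen < 7)
		return false;

	/* Assumption: other engines remain ring-only on current gens. */
	return is_render_engine;
}
```

A caller would fall back to the plain recursive batch whenever this
returns false.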

> The good news is that, being an unrecognised MI cmd, it is skipped. Even
> better news if it causes a GPU hang ;)

Based on the description it could be that it just blindly jumps to UHPTR
and there wouldn't be any way to get back to the batch. Maybe. That's
assuming UHPTR would have the valid bit set in the first place. I guess
if UHPTR isn't valid then nothing should happen whether or not the
parser accepts the command.

I don't actually know how you would get back to the previous point on
the ring either after jumping to UHPTR from the ring. Is the previous
HEAD saved somewhere?
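
For reference, the batch being discussed could be laid out roughly as
follows. This is only a sketch under assumptions: the quoted hunk is
truncated, so the position of MI_BATCH_BUFFER_START after the arbitration
point is a guess, build_hang_batch() is a hypothetical name, and the jump
address is left as zero where the real code would rely on a relocation.
The opcode values use the usual MI encoding with the opcode in bits 28:23.

```c
#include <stdint.h>

/* MI commands carry their opcode in bits 28:23 of the first dword. */
#define MI_INSTR(opcode, flags)	(((uint32_t)(opcode) << 23) | (flags))
#define MI_NOOP			MI_INSTR(0x00, 0)
#define MI_ARB_CHECK		MI_INSTR(0x05, 0)
#define MI_BATCH_BUFFER_END	MI_INSTR(0x0a, 0)
#define MI_BATCH_BUFFER_START	MI_INSTR(0x31, 0)

/*
 * Hypothetical layout; returns the number of dwords written.
 * The jump address (dword 2, plus dword 3 on gen8+) is left as zero,
 * standing in for a relocation back to the start of the batch.
 */
static int build_hang_batch(uint32_t *b, int gen)
{
	int len = gen >= 8 ? 3 : 2; /* dwords in MI_BATCH_BUFFER_START */
	int i = 0;

	b[i++] = MI_ARB_CHECK;               /* arbitration point */
	b[i++] = MI_BATCH_BUFFER_START | (len - 2);
	while (i < 1 + len)
		b[i++] = 0;                  /* address, filled by relocation */
	b[i++] = MI_BATCH_BUFFER_END;        /* never reached while looping */
	b[i++] = MI_NOOP;                    /* pad to an even dword count */
	return i;
}
```

The buffer passed in needs room for at least six dwords on gen8+.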

-- 
Ville Syrjälä
Intel OTC

