[Intel-gfx] [PATCH i-g-t] lib/igt_gt.c : allow changes to stop_rings mode bits

Gore, Tim tim.gore at intel.com
Mon Jul 13 09:07:14 PDT 2015



Tim GoreĀ 
Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ


> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Monday, July 13, 2015 3:59 PM
> To: Gore, Tim
> Cc: Daniel Vetter; intel-gfx at lists.freedesktop.org; Wood, Thomas
> Subject: Re: [Intel-gfx] [PATCH i-g-t] lib/igt_gt.c : allow changes to stop_rings
> mode bits
> 
> On Mon, Jul 13, 2015 at 09:43:11AM +0000, Gore, Tim wrote:
> >
> >
> > Tim Gore
> > Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon
> > SN3 1RJ
> >
> > > -----Original Message-----
> > > From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of
> > > Daniel Vetter
> > > Sent: Monday, July 13, 2015 10:30 AM
> > > To: Gore, Tim
> > > Cc: intel-gfx at lists.freedesktop.org; Wood, Thomas
> > > Subject: Re: [Intel-gfx] [PATCH i-g-t] lib/igt_gt.c : allow changes
> > > to stop_rings mode bits
> > >
> > > On Fri, Jul 10, 2015 at 02:06:06PM +0100, tim.gore at intel.com wrote:
> > > > From: Tim Gore <tim.gore at intel.com>
> > > >
> > > > In function igt_set_stop_rings, the test
> > > >   igt_assert_f(flags == 0 || current == 0, ..
> > > >
> > > > will fail if we are trying to force a hang but the
> > > > STOP_RINGS_ALLOW_BAN or STOP_RINGS_ALLOW_ERROR bit is set.
> > > > With the introduction of per ring resets in the driver (in
> > > > android) these bits do not get cleared to zero when an individual
> > > > ring is reset. This causes subsequent attempt to cause a ring hang
> > > > via this function to fail, leading to several igt tests failing
> > > > (ie gem_reset_stats subtest ban-xxx etc).
> > >
> > > Fix tdr to reset these instead?
> > > -Daniel
> > >
> > I could change tdr, but why. When the TDR handles a ring hang and
> > resets the ring, why would it modify the flag that defines if the
> > driver should ban a frequently hanging context? If we get rid of the
> > stop_rings interface, as Chris Wilson suggested, we would still need
> > to keep the STOP_RING_ALLOW_BAN/ALLOW_ERRORS bits in debugfs, but
> you
> > would not expect to have to re-write these bits each time there is a ring
> reset.
> 
> The fix current hang recover code to no reset this, add some grace period,
> then push this patch to igt. We don't have full-blown abi guarantees for
> debugfs/igt stuff, but I want at least a few months (really last released
> kernel&igt) of backwards/forward compatibility. And inconsistent behaviour
> isn't great imo.
> -Daniel

Sorry Daniel, I didn't really follow that.
I didn't want a gpu reset to clear the ALLOW_BAN bit, since this will defeat the
point of this bit, except perhaps in test situations where you can keep setting it
each time you deliberately cause a hang. It seems like the ALLOW_BAN bit has
uses in real world situations, although I don't know it anyone uses it.

Would you be more comfortable with
Igt_assert_f ( ( flags ==0 ) || 
     (( current & STOP_RING_ALL) ==0)  && ((current ^ flags) & ~ STOP_RING_ALL == 0 ) )

So the either the new flags must be 0 (currently allowed) or the existing flags must
indicate that  all hangs are cleared (0 except possibly the mode bits) AND the mode
bits you are writing are the same as the current values. ??

  Tim
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


More information about the Intel-gfx mailing list