[Intel-gfx] [PATCH 1/3] drm/i915: Rely on accurate request tracking for finding hung batches

Ben Widawsky ben at bwidawsk.net
Wed Feb 12 20:02:34 CET 2014


On Wed, Feb 12, 2014 at 08:25:58AM +0000, Chris Wilson wrote:
> On Tue, Feb 11, 2014 at 05:57:19PM -0800, Ben Widawsky wrote:
> > On Mon, Feb 10, 2014 at 04:30:49PM +0200, Mika Kuoppala wrote:
> > > -static struct drm_i915_gem_request *
> > > -i915_gem_find_first_non_complete(struct intel_ring_buffer *ring)
> > > +struct drm_i915_gem_request *
> > > +i915_gem_find_active_request(struct intel_ring_buffer *ring)
> > >  {
> > >  	struct drm_i915_gem_request *request;
> > > -	const u32 completed_seqno = ring->get_seqno(ring, false);
> > > +	u32 completed_seqno;
> > > +
> > > +	if (WARN_ON(!ring->get_seqno))
> > > +		return NULL;
> > 
> > ring->get_seqno(ring, false) ?
> 
> ?
> 
> This was originally used in the error capture code to detect
> uninitialised rings.

We have a new flag for that now, don't we? Also, I don't think we can
actually have an uninitialized ring at this point.

> 
> > This looks like it's currently wrong below as well.
> 
> ?
>  
> > I still would have preferred to keep both methods for a while until we
> > felt really comfortable with this... the commit message's statement that
> > getting bad error state is inevitably a good thing is total bunk, I
> > think.
> 
> I strongly disagree. One of the critical factors you first scan for when
> reading error states is whether it is self-consistent. If the error
> state is inconsistent, we also know that some of our critical
> infrastructure is wrong - which we can't know if the error capture code
> was different. And in the cases where the error state is inconsistent,
> the scanning method would also be snafu (because ACTHD is no longer
> within the expected bounds).
> -Chris
> 

The ACTHD case is one of the reasons I can't argue for keeping the old
method (though I think capturing all buffers with the COMMAND domain gets
you to the same end). Since you read way more error states than I do, I'm
happy to favor your opinion. As I said [but you cut], I am just a
chicken.
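
For anyone following along, the request-tracking lookup being discussed
amounts to roughly the following. This is only a minimal standalone
sketch, not the patch itself: the sketch_* names and the hand-rolled
singly linked list are assumptions made so it builds outside the kernel,
and the wrap-safe comparison is written out here rather than taken from
the driver headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a request queued on a ring, oldest first. */
struct sketch_request {
	uint32_t seqno;
	struct sketch_request *next;
};

/*
 * Wrap-safe check that seq1 is at or after seq2. The signed difference
 * keeps the ordering correct across a 32-bit seqno wraparound.
 */
static bool sketch_seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/*
 * Return the oldest request the ring has not completed yet, or NULL if
 * every queued request has already been passed by completed_seqno.
 */
static struct sketch_request *
sketch_find_active_request(struct sketch_request *oldest,
			   uint32_t completed_seqno)
{
	struct sketch_request *request;

	for (request = oldest; request; request = request->next) {
		if (sketch_seqno_passed(completed_seqno, request->seqno))
			continue;	/* already retired by the GPU */

		return request;
	}

	return NULL;
}
```

With requests 10, 20 and 30 queued and a completed seqno of 15, this
returns the request with seqno 20, which is the one whose batch would be
blamed for the hang.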

You already got my r-b, stop arguing.

> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center


