[Intel-gfx] [PATCH 3/3] [v2] drm/i915: Capture current context on error

Daniel Vetter daniel at ffwll.ch
Mon Mar 4 20:57:23 CET 2013


On Sun, Feb 24, 2013 at 06:10:02PM -0800, Ben Widawsky wrote:
> On error, this represents the state of the currently running context at
> the time it was loaded.
> 
> Unfortunately, since we're hung and can't switch out the context this
> may not tell us too much about the most current state of the context,
> but does give clues about what has happened since loading.
> 
> Thanks to recent doc updates, we have a little more confidence regarding
> what is actually in this memory, and perhaps it will help us gain more
> insight into certain bugs. AFAICT, the most interesting info is in the
> first page. To save space, we only capture the first page. In the
> future, we might want to dump more.
> 
> Sample of the relevant part of error state:
> render ring --- HW Context = 0x01b20000
> [0000] 00000000 1100105f 00002028 ffff0880
> [0010] 0000209c feff4040 000020c0 efdf0080
> [0020] 00002178 00000001 0000217c 00145855
> [0030] 00002310 00000000 00002314 00000000
> 
> v2: Move error collection to the ring error code
> Change format of dump to not confuse intel_error_decode (Chris)
> Put the context error object with the others (Chris)
> Don't search bound_list instead of active_list (chris)
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=55845
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 15 +++++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h     |  2 +-
>  drivers/gpu/drm/i915/i915_irq.c     | 10 ++++++++++
>  3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 7c65ab8..9b4f3d7 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -772,6 +772,21 @@ static int i915_error_state(struct seq_file *m, void *unused)
>  				}
>  			}
>  		}
> +		if ((obj = error->ring[i].ctx)) {
> +			seq_printf(m, "%s --- HW Context = 0x%08x\n",
> +				   dev_priv->ring[i].name,
> +				   obj->gtt_offset);
> +			offset = 0;
> +			for (elt = 0; elt < PAGE_SIZE/16; elt+=4) {
> +				seq_printf(m, "[%04x] %08x %08x %08x %08x\n",
> +					   offset,
> +					   obj->pages[0][elt],
> +					   obj->pages[0][elt+1],
> +					   obj->pages[0][elt+2],
> +					   obj->pages[0][elt+3]);
> +					offset += 16;
> +			}
> +		}
>  	}
>  
>  	if (error->overlay)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e95337c..bedabcd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -243,7 +243,7 @@ struct drm_i915_error_state {
>  			int page_count;
>  			u32 gtt_offset;
>  			u32 *pages[0];
> -		} *ringbuffer, *batchbuffer;
> +		} *ringbuffer, *batchbuffer, *ctx;
>  		struct drm_i915_error_request {
>  			long jiffies;
>  			u32 seqno;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 6a328b8..125b7d8 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1244,6 +1244,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_ring_buffer *ring;
>  	struct drm_i915_gem_request *request;
> +	struct drm_i915_gem_object *obj;
>  	int i, count;
>  
>  	for_each_ring(ring, dev_priv, i) {
> @@ -1255,6 +1256,15 @@ static void i915_gem_record_rings(struct drm_device *dev,
>  		error->ring[i].ringbuffer =
>  			i915_error_object_create(dev_priv, ring->obj);
>  
> +		/* Currently render ring is the only HW context user */
> +		if ((ring->id == RCS) && error->ccid) {
> +			list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
> +			if ((error->ccid & PAGE_MASK) == obj->gtt_offset)
> +				error->ring[i].ctx =
> +					i915_error_object_create_sized(dev_priv,
> +								       obj, 1);

checkpatch is a bit unhappy about your hunk in i915_debugfs.c. And it
complains about the overtly long line here. I wanted to quickly fix this
up until I've noticed that the indentation here is a bit ... artistic.
Fixing it up indents the code way too much, I think this should be
extracted into a tiny helper function, which allows us to flatten the
logic with early return;s and continue;s.

First two patches of this series merged to dinq, thanks.
-Daniel

> +		}
> +
>  		count = 0;
>  		list_for_each_entry(request, &ring->request_list, list)
>  			count++;
> -- 
> 1.8.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list