[Intel-gfx] [PATCH 3/3] [v2] drm/i915: Capture current context on error
Daniel Vetter
daniel at ffwll.ch
Mon Mar 4 20:57:23 CET 2013
On Sun, Feb 24, 2013 at 06:10:02PM -0800, Ben Widawsky wrote:
> On error, this represents the state of the currently running context at
> the time it was loaded.
>
> Unfortunately, since we're hung and can't switch out the context this
> may not tell us too much about the most current state of the context,
> but does give clues about what has happened since loading.
>
> Thanks to recent doc updates, we have a little more confidence regarding
> what is actually in this memory, and perhaps it will help us gain more
> insight into certain bugs. AFAICT, the most interesting info is in the
> first page. To save space, we only capture the first page. In the
> future, we might want to dump more.
>
> Sample of the relevant part of error state:
> render ring --- HW Context = 0x01b20000
> [0000] 00000000 1100105f 00002028 ffff0880
> [0010] 0000209c feff4040 000020c0 efdf0080
> [0020] 00002178 00000001 0000217c 00145855
> [0030] 00002310 00000000 00002314 00000000
>
> v2: Move error collection to the ring error code
> Change format of dump to not confuse intel_error_decode (Chris)
> Put the context error object with the others (Chris)
> Don't search bound_list instead of active_list (chris)
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=55845
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 15 +++++++++++++++
> drivers/gpu/drm/i915/i915_drv.h | 2 +-
> drivers/gpu/drm/i915/i915_irq.c | 10 ++++++++++
> 3 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 7c65ab8..9b4f3d7 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -772,6 +772,21 @@ static int i915_error_state(struct seq_file *m, void *unused)
> }
> }
> }
> + if ((obj = error->ring[i].ctx)) {
> + seq_printf(m, "%s --- HW Context = 0x%08x\n",
> + dev_priv->ring[i].name,
> + obj->gtt_offset);
> + offset = 0;
> + for (elt = 0; elt < PAGE_SIZE/16; elt+=4) {
> + seq_printf(m, "[%04x] %08x %08x %08x %08x\n",
> + offset,
> + obj->pages[0][elt],
> + obj->pages[0][elt+1],
> + obj->pages[0][elt+2],
> + obj->pages[0][elt+3]);
> + offset += 16;
> + }
> + }
> }
>
> if (error->overlay)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e95337c..bedabcd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -243,7 +243,7 @@ struct drm_i915_error_state {
> int page_count;
> u32 gtt_offset;
> u32 *pages[0];
> - } *ringbuffer, *batchbuffer;
> + } *ringbuffer, *batchbuffer, *ctx;
> struct drm_i915_error_request {
> long jiffies;
> u32 seqno;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 6a328b8..125b7d8 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1244,6 +1244,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct intel_ring_buffer *ring;
> struct drm_i915_gem_request *request;
> + struct drm_i915_gem_object *obj;
> int i, count;
>
> for_each_ring(ring, dev_priv, i) {
> @@ -1255,6 +1256,15 @@ static void i915_gem_record_rings(struct drm_device *dev,
> error->ring[i].ringbuffer =
> i915_error_object_create(dev_priv, ring->obj);
>
> + /* Currently render ring is the only HW context user */
> + if ((ring->id == RCS) && error->ccid) {
> + list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
> + if ((error->ccid & PAGE_MASK) == obj->gtt_offset)
> + error->ring[i].ctx =
> + i915_error_object_create_sized(dev_priv,
> + obj, 1);
checkpatch is a bit unhappy about your hunk in i915_debugfs.c. And it
complains about the overtly long line here. I wanted to quickly fix this
up until I've noticed that the indentation here is a bit ... artistic.
Fixing it up indents the code way too much, I think this should be
extracted into a tiny helper function, which allows us to flatten the
logic with early return;s and continue;s.
First two patches of this series merged to dinq, thanks.
-Daniel
> + }
> +
> count = 0;
> list_for_each_entry(request, &ring->request_list, list)
> count++;
> --
> 1.8.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list