[Intel-gfx] [PATCH 1/1] drm/i915: track pid of GEM object creators

Ben Widawsky ben at bwidawsk.net
Thu Feb 2 19:27:14 CET 2012


On Thu, Feb 02, 2012 at 02:52:22PM -0200, Eugeni Dodonov wrote:
> This allows to hopefully find out who was responsible for the GPU death.
> 
> To simplify post-portem analysis, we also search for the the processes
> names when gathering the i915_error_state and when peeking at the list of
> active gem objects in debugfs.
> 
> Signed-off-by: Eugeni Dodonov <eugeni.dodonov at intel.com>

s/portem/mortem

I'd recommend adding some logic for flink'd buffers. Perhaps a list of
processes is too tedious, but just knowing that a buffer was exported is
a start.

How will this actually work? Since the error occurs asynchronously,
surely another process could have come in and submitted work with that
BO (in the semaphores case especially), at which point you'll overwrite
obj->pid. You were probably aware of this already, but I think it makes
the feature almost unusable for trickier bugs. In those cases it's
misleading at best and harmful at worst.

I once thought about this as well, and the only coherent way to do it is
to track per seqno all this information. Maybe that's even worth doing
with certain debug flags?
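
Roughly what I have in mind, as an untested userspace sketch rather than
driver code. Every name here (submit_record, record_submit, pid_for_seqno,
RING_SLOTS) is made up for illustration and none of it exists in i915. The
point is only that the pid is stored per submission, keyed by seqno, so a
later submitter cannot overwrite an earlier one the way a single obj->pid
field gets overwritten.

```c
#include <stdint.h>

/* Hypothetical: how many recent submissions we remember. */
#define RING_SLOTS 64

/* Hypothetical record: which process submitted the work for a seqno. */
struct submit_record {
	uint32_t seqno;
	int pid;
	int valid;
};

static struct submit_record records[RING_SLOTS];

/* Remember the submitter at submission time, keyed by seqno. */
static void record_submit(uint32_t seqno, int pid)
{
	struct submit_record *r = &records[seqno % RING_SLOTS];

	r->seqno = seqno;
	r->pid = pid;
	r->valid = 1;
}

/*
 * Look up the submitter for the seqno reported by the error state.
 * Returns -1 if the slot was never filled or has since been reused.
 */
static int pid_for_seqno(uint32_t seqno)
{
	const struct submit_record *r = &records[seqno % RING_SLOTS];

	return (r->valid && r->seqno == seqno) ? r->pid : -1;
}
```

If process 1234 submits at seqno 10 and process 5678 submits the same BO
at seqno 11, a hang reported at seqno 10 still resolves to 1234, whereas a
per-object pid would already read 5678. The fixed-size table is an
assumption to keep the cost bounded, and is the sort of thing that could
sit behind a debug flag.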

I guess I'd feel a lot better about this if you had actually used this
feature to debug non-trivial bugs. If you have, describe them and I'll
offer up my r-b.


> [...]


