[Intel-gfx] [PATCH 1/1] drm/i915: track first and last processes that touch gem objects

Daniel Vetter daniel at ffwll.ch
Mon Feb 6 17:15:44 CET 2012


On Fri, Feb 03, 2012 at 06:02:38PM +0000, Chris Wilson wrote:
> On Fri,  3 Feb 2012 12:43:25 -0200, Eugeni Dodonov <eugeni.dodonov at intel.com> wrote:
> > This hopefully allows us to find out who was responsible for the GPU death.
> > We record the 1st and last process to touch each object, to keep track of
> > the process which created the object originally and the last process to
> > touch it.
> > 
> > To simplify post-mortem analysis, we also search for the process names
> > when gathering the i915_error_state and when peeking at the list of active
> > gem objects in debugfs. This is not perfect for tracking all the
> > processes, as they can quit or die before their batchbuffers got executed,
> > but having to track them during the entire object lifetime would be
> > excessively memcpy hungry.
> 
> I think you've slightly missed the mark here. Tracking who created a
> buffer and who last used it is interesting, but you really need to also
> track on whose behalf the request (i.e. each batch) is executing.
> 
> For the goal of recording creator, you could just use:
> 
>   obj->creator = current ? current->pid : 0;
> 
> in i915_gem_object_init with 0 as the special value for objects created by
> the driver outside of process context. And similarly for i915_add_request,
> though I'd associate those with the owner of the file_priv.  The important
> point here is that a buffer may be associated with multiple batches
> submitted by one or more clients before a hang is detected, and so unless
> the dispatch pid is tracked you do not know who submitted the erroneous
> batch. (Even a batch may be submitted more than once by many clients,
> given sufficient pathology.) So adding the request queue to the
> i915_error_state would also be interesting, especially with the jiffies
> and ring->tail.
> 
> Also note that there is no direct link between i915_gem_fault() and usage
> of the object. The place to add the obj->last_used_by tracking is domain
> management, which catches the usage of CPU mappings as well as
> move-to-active.

I'll second Chris here - I think the interesting stuff is to add some kind
of cheap ownership tracking, not who exactly created the buffer. The
latter is imo only really interesting for resource accounting, and that
would require it to be somewhat more solid. And we don't do any resource
accounting atm anyway.
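
To make the idea of cheap ownership tracking concrete, here is a rough
userspace sketch of what Chris describes above: store a pid on each object
at creation, store the submitting pid plus jiffies and ring tail on each
request, and walk the request queue after a hang. Everything named
sketch_* is hypothetical and is not the real i915 code. Treating the oldest
incomplete request as the suspect is an assumption made for illustration,
and seqno wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver structures; not the real i915 types. */
struct sketch_object {
	int creator_pid;	/* 0 = created by the driver outside process context */
};

struct sketch_request {
	uint32_t seqno;		/* submission order */
	uint32_t tail;		/* ring tail recorded at submission */
	unsigned long jiffies;	/* timestamp recorded at submission */
	int submitter_pid;	/* owner of the submitting file_priv */
	const struct sketch_object *batch;
};

#define SKETCH_MAX_REQUESTS 8

struct sketch_ring {
	struct sketch_request queue[SKETCH_MAX_REQUESTS];
	size_t count;
	uint32_t next_seqno;
};

/* Record the creator once, at object init time. */
static void sketch_object_init(struct sketch_object *obj, int current_pid)
{
	obj->creator_pid = current_pid;
}

/* Queue a request and remember on whose behalf it was submitted. */
static int sketch_add_request(struct sketch_ring *ring,
			      const struct sketch_object *batch,
			      int submitter_pid, uint32_t tail,
			      unsigned long now)
{
	struct sketch_request *rq;

	assert(ring != NULL);
	if (ring->count == SKETCH_MAX_REQUESTS)
		return -1;

	rq = &ring->queue[ring->count++];
	rq->seqno = ++ring->next_seqno;
	rq->tail = tail;
	rq->jiffies = now;
	rq->submitter_pid = submitter_pid;
	rq->batch = batch;
	return 0;
}

/* Oldest request whose seqno is past the last completed one, or NULL. */
static const struct sketch_request *
sketch_first_incomplete(const struct sketch_ring *ring, uint32_t completed_seqno)
{
	size_t i;

	for (i = 0; i < ring->count; i++)
		if (ring->queue[i].seqno > completed_seqno)
			return &ring->queue[i];
	return NULL;
}

/* One batch created by pid 100, submitted by pid 100 and then by pid 200. */
static int sketch_demo(void)
{
	struct sketch_ring ring = {0};
	struct sketch_object batch;
	const struct sketch_request *rq;

	sketch_object_init(&batch, 100);
	sketch_add_request(&ring, &batch, 100, 0x40, 1000);
	sketch_add_request(&ring, &batch, 200, 0x80, 1001);

	/* Pretend the GPU hung after completing seqno 1. */
	rq = sketch_first_incomplete(&ring, 1);
	if (!rq)
		return -1;

	printf("suspect pid %d (seqno %u, tail 0x%x), batch created by pid %d\n",
	       rq->submitter_pid, (unsigned)rq->seqno, (unsigned)rq->tail,
	       rq->batch->creator_pid);
	return rq->submitter_pid;
}
```

In the demo the creator of the batch and the suspect submitter differ, which
is the case the creator pid alone cannot resolve.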
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


