[Intel-gfx] [PATCH] drm/i915: Fix erroneous dereference of batch_obj inside reset_status

Daniel Vetter daniel at ffwll.ch
Thu Dec 12 10:54:20 CET 2013


On Thu, Dec 05, 2013 at 04:22:02PM +0000, Chris Wilson wrote:
> On Thu, Dec 05, 2013 at 06:07:27PM +0200, Mika Kuoppala wrote:
> > Chris Wilson <chris at chris-wilson.co.uk> writes:
> > 
> > > As the rings may be processed and their requests deallocated in a
> > > different order to the natural retirement during a reset,
> > >
> > > /* Whilst this request exists, batch_obj will be on the
> > >  * active_list, and so will hold the active reference. Only when this
> > >  * request is retired will the the batch_obj be moved onto the
> > >  * inactive_list and lose its active reference. Hence we do not need
> > >  * to explicitly hold another reference here.
> > >  */
> > >
> > > is violated, and the batch_obj may be dereferenced after it had been
> > > freed on another ring. This can be simply avoided by processing the
> > > status update prior to deallocating any requests.
> > >
> > > Fixes regression (a possible OOPS following a GPU hang) from
> > > commit aa60c664e6df502578454621c3a9b1f087ff8d25
> > > Author: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > > Date:   Wed Jun 12 15:13:20 2013 +0300
> > >
> > >     drm/i915: find guilty batch buffer on ring resets
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > > Cc: stable at vger.kernel.org
> > 
> > Passes the igt/gem_reset_stats/close-pending-fork and
> > doesn't affect the fast path.
> > 
> > Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
> 
> For reference, this is the comment I added upon request:
> 
> @@ -2502,8 +2508,15 @@ void i915_gem_reset(struct drm_device *dev)
>         struct intel_ring_buffer *ring;
>         int i;
>  
> +       /* Before we free the objects from the requests, we need to inspect
> +        * them for finding the guilty party. As the requests only borrow
> +        * their reference to the objects, the inspection must be done first.
> +        */
> +       for_each_ring(ring, dev_priv, i)
> +               i915_gem_reset_ring_status(dev_priv, ring);
> +
>         for_each_ring(ring, dev_priv, i)
> -               i915_gem_reset_ring_lists(dev_priv, ring);
> +               i915_gem_reset_ring_cleanup(dev_priv, ring);
>  
>         i915_gem_cleanup_ringbuffer(dev);

QA didn't hit this bug since the test wasn't added to the right make
target. With that sorted I've now merged this patch (including comment) to
-fixes.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list