[Intel-gfx] [PATCH] drm/i915: Fix erroneous dereference of batch_obj inside reset_status
Daniel Vetter
daniel at ffwll.ch
Thu Dec 12 10:54:20 CET 2013
On Thu, Dec 05, 2013 at 04:22:02PM +0000, Chris Wilson wrote:
> On Thu, Dec 05, 2013 at 06:07:27PM +0200, Mika Kuoppala wrote:
> > Chris Wilson <chris at chris-wilson.co.uk> writes:
> >
> > > As the rings may be processed and their requests deallocated in a
> > > different order to the natural retirement during a reset,
> > >
> > > /* Whilst this request exists, batch_obj will be on the
> > > * active_list, and so will hold the active reference. Only when this
> > > * request is retired will the the batch_obj be moved onto the
> > > * inactive_list and lose its active reference. Hence we do not need
> > > * to explicitly hold another reference here.
> > > */
> > >
> > > is violated, and the batch_obj may be dereferenced after it had been
> > > freed on another ring. This can be simply avoided by processing the
> > > status update prior to deallocating any requests.
> > >
> > > Fixes regression (a possible OOPS following a GPU hang) from
> > > commit aa60c664e6df502578454621c3a9b1f087ff8d25
> > > Author: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > > Date: Wed Jun 12 15:13:20 2013 +0300
> > >
> > > drm/i915: find guilty batch buffer on ring resets
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > > Cc: stable at vger.kernel.org
> >
> > Passes the igt/gem_reset_stats/close-pending-fork and
> > doesn't affect the fast path.
> >
> > Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
>
> For reference, this is the comment I added upon request:
>
> @@ -2502,8 +2508,15 @@ void i915_gem_reset(struct drm_device *dev)
> struct intel_ring_buffer *ring;
> int i;
>
> + /* Before we free the objects from the requests, we need to inspect
> + * them for finding the guilty party. As the requests only borrow
> + * their reference to the objects, the inspection must be done first.
> + */
> + for_each_ring(ring, dev_priv, i)
> + i915_gem_reset_ring_status(dev_priv, ring);
> +
> for_each_ring(ring, dev_priv, i)
> - i915_gem_reset_ring_lists(dev_priv, ring);
> + i915_gem_reset_ring_cleanup(dev_priv, ring);
>
> i915_gem_cleanup_ringbuffer(dev);
QA didn't hit this bug since the test wasn't added to the right make
target. With that sorted I've now merged this patch (including comment) to
-fixes.
Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list