[Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display

Daniel Vetter daniel at ffwll.ch
Mon May 12 18:14:53 CEST 2014


On Mon, May 12, 2014 at 06:11:18PM +0200, Daniel Vetter wrote:
> On Mon, May 12, 2014 at 09:05:45AM +0000, Mateo Lozano, Oscar wrote:
> > Hi Daniel,
> > 
> > Sorry, this fell through the cracks:
> > 
> > > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to
> > > GGTT in is_pin_display
> > > 
> > > On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo at intel.com wrote:
> > > > From: Oscar Mateo <oscar.mateo at intel.com>
> > > >
> > > > Otherwise, we do a NULL pointer dereference.
> > > >
> > > > I've seen this happen while handling an error in
> > > > i915_gem_object_pin_to_display_plane():
> > > >
> > > > If i915_gem_object_set_cache_level() fails, we call is_pin_display()
> > > > to handle the error. At this point, the object is still not pinned to
> > > > GGTT and maybe not even bound, so we have to check before we
> > > > dereference its GGTT vma.
> > > >
> > > > Issue: VIZ-3772
> > > > Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
> > > 
> > > Have you looked into provoking this with an igt testcase? On a hunch a busy
> > > load (to extend the race window) plus the usual interruptor trick to jump out of
> > > wait_seqno calls should be able to make this go kaboom on command. But I
> > > haven't analyzed the bug in detail.
> > 
> > AFAICT, the only sequence where this is likely to happen (because we are handling a recently created object) is:
> > 
> > intelfb_alloc -> intel_pin_and_fence_fb_obj -> i915_gem_object_pin_to_display_plane -> i915_gem_object_set_cache_level -> is_pin_display
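The error path in that call chain can be modelled in a few lines of plain C.
This is only a sketch under stated assumptions: every model_* name and both
structs are hypothetical stand-ins invented for illustration, not the real
i915 types or functions, and the guard shown is one plausible shape of the
check the patch description asks for.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures (not the i915 types). */
struct model_vma {
	int pin_count;
};

struct model_obj {
	struct model_vma *ggtt_vma; /* NULL until the object is bound */
};

/* Guarded check: an object with no GGTT vma cannot be pinned for display. */
static bool model_is_pin_display(const struct model_obj *obj)
{
	assert(obj != NULL);
	if (obj->ggtt_vma == NULL)
		return false; /* unbound: nothing to dereference */
	return obj->ggtt_vma->pin_count > 0;
}

/* Stand-in for the cache-level change; fail_with is 0 or a negative errno. */
static int model_set_cache_level(struct model_obj *obj, int fail_with)
{
	(void)obj;
	return fail_with;
}

/*
 * Model of the pin path: if the cache-level change fails, the error path
 * queries the display-pin state before the object has been bound.
 */
static int model_pin_to_display_plane(struct model_obj *obj,
				      struct model_vma *vma,
				      int fail_with, bool *pin_display)
{
	int ret = model_set_cache_level(obj, fail_with);

	if (ret) {
		*pin_display = model_is_pin_display(obj);
		return ret;
	}

	/* Success: bind and pin, then report the display-pin state. */
	obj->ggtt_vma = vma;
	vma->pin_count++;
	*pin_display = model_is_pin_display(obj);
	return 0;
}
```

Calling model_pin_to_display_plane() with fail_with set to -EINTR on a fresh
object exercises the error path while ggtt_vma is still NULL. Without the
NULL check in model_is_pin_display() that call would dereference a NULL
pointer.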
> 
> Pageflipping to a freshly allocated BO without ever touching it beforehand
> should be able to achieve the same, if this is really all that's needed.
> 
> But looking at the code a better way should be:
> 1. Create new bo, wrap it in a kms fb.
> 2. Slap busy load onto that bo, e.g. repeatedly fill it with the blitter.
> 3. Enable evil interruptor (igt_fork_signal_helper).
> 4. Submit pageflip
> 
> -> Boom since the set_cache_level will block, get interrupted and exit
> early with -EINTR.
> 
> Given sufficient overkill in 2. this should be 100% reliable to reproduce.
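The interruptor idea in steps 3 and 4 can be illustrated outside the driver
with a small userspace sketch: a periodic signal hits a blocking call, which
then returns early with EINTR. This is an assumption-laden stand-in, not an
igt test. It uses a POSIX interval timer in the same process rather than
igt_fork_signal_helper, nanosleep() plays the role of the blocking wait, and
interrupted_wait(), on_alarm() and signals_seen are names made up for the
example.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Counts how many interruptor signals the handler has seen. */
static volatile sig_atomic_t signals_seen;

static void on_alarm(int sig)
{
	(void)sig;
	signals_seen++;
}

/*
 * Block for up to `seconds` while SIGALRM fires every `interval_usec`
 * microseconds. Returns 0 if the sleep ran to completion, or a negative
 * errno (normally -EINTR) if it was cut short.
 */
static int interrupted_wait(int seconds, long interval_usec)
{
	struct sigaction sa, old_sa;
	struct itimerval timer, old_timer;
	struct timespec req = { .tv_sec = seconds, .tv_nsec = 0 };
	int ret = 0;

	assert(interval_usec > 0 && interval_usec < 1000000);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0; /* no SA_RESTART: interrupted calls fail with EINTR */
	if (sigaction(SIGALRM, &sa, &old_sa) < 0)
		return -errno;

	/* Arm the periodic "evil interruptor". */
	memset(&timer, 0, sizeof(timer));
	timer.it_value.tv_usec = interval_usec;
	timer.it_interval.tv_usec = interval_usec;
	if (setitimer(ITIMER_REAL, &timer, &old_timer) < 0) {
		ret = -errno;
		goto restore;
	}

	/* The blocking call: the first signal delivered makes it return early. */
	if (nanosleep(&req, NULL) < 0)
		ret = -errno;

	/* Put the previous timer and signal disposition back. */
	setitimer(ITIMER_REAL, &old_timer, NULL);
restore:
	sigaction(SIGALRM, &old_sa, NULL);
	return ret;
}
```

For example, interrupted_wait(5, 10000) asks for a five second sleep with a
signal every 10 ms, so the sleep is cut short almost immediately. The longer
the wait blocks, the more certain the interruption, which is the role the
busy load plays in step 2.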

Aside: Those kinds of tricks are a big reason why I think igts aren't just
useful as testcases, but also as a way to really understand how a bug comes
about. At least ime, finally figuring out the last ingredient to make an igt
fully reliable has often resulted in a suddenly much clearer understanding
of the bug at hand.

I call this "review by asking for an igt" ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch