[Intel-gfx] [PATCH] drm/i915: fix up error cleanup in i915_gem_object_bind_to_gtt

Daniel Vetter daniel at ffwll.ch
Tue Jul 23 08:10:59 CEST 2013


On Mon, Jul 22, 2013 at 11:02:08PM +0200, Daniel Vetter wrote:
> On Mon, Jul 22, 2013 at 01:12:30PM -0700, Ben Widawsky wrote:
> > On Mon, Jul 22, 2013 at 12:12:38PM +0200, Daniel Vetter wrote:
> > > This has been broken in
> > > 
> > > commit 2f63315692b1d3c055972ad33fc7168ae908b97b
> > > Author: Ben Widawsky <ben at bwidawsk.net>
> > > Date:   Wed Jul 17 12:19:03 2013 -0700
> > > 
> > >     drm/i915: Create VMAs
> > > 
> > > which resulted in an OOPS the first time around we've hit -ENOSPC.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
> > > Cc: Imre Deak <imre.deak at intel.com>
> > > Cc: Ben Widawsky <ben at bwidawsk.net>
> > > Tested-by: meng <mengmeng.meng at intel.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++------
> > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index cfa6588..c87a6ec 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3121,8 +3121,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > >  
> > >  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > >  	if (IS_ERR(vma)) {
> > > -		i915_gem_object_unpin_pages(obj);
> > > -		return PTR_ERR(vma);
> > > +		ret = PTR_ERR(vma);
> > > +		goto err_unpin;
> > >  	}
> > 
> > Adding the extra goto here seems pointless to me.
> 
> Like explained on irc, that's just to have a nice OCD reverse err_foo:
> label stacking at the end of the function.
> 
> > 
> > >  
> > >  search_free:
> > > @@ -3138,17 +3138,17 @@ search_free:
> > >  		if (ret == 0)
> > >  			goto search_free;
> > >  
> > > -		goto err_out;
> > > +		goto err_free_vma;
> > >  	}
> > 
> > My preference would be to exit early in drm_mm_remove_node() if the node
> > isn't allocated. I think at least we should add a WARN to
> > drm_mm_remove_node if the node->allocated == 0.
> 
> Hm, good idea. I'll create a quick patch.
> 
> > >  	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
> > >  					      obj->cache_level))) {
> > >  		ret = -EINVAL;
> > > -		goto err_out;
> > > +		goto err_remove_node;
> > >  	}
> > >  
> > >  	ret = i915_gem_gtt_prepare_object(obj);
> > >  	if (ret)
> > > -		goto err_out;
> > > +		goto err_remove_node;
> > >  
> > >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > >  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > > @@ -3167,9 +3167,11 @@ search_free:
> > >  	i915_gem_verify_gtt(dev);
> > >  	return 0;
> > >  
> > > -err_out:
> > > +err_remove_node:
> > >  	drm_mm_remove_node(&vma->node);
> > > +err_free_vma:
> > >  	i915_gem_vma_destroy(vma);
> > > +err_unpin:
> > >  	i915_gem_object_unpin_pages(obj);
> > >  	return ret;
> > >  }
> > 
> > Reviewed-by: Ben Widawsky <ben at bwidawsk.net>
> 
> Queued for -next, thanks for the review.

So I think this bug warrants a testcase in igt. I expected
gem_tiled_blits/interruptible to hit this, but apparently it does not.
What we want is for the drm_mm_insert_node_in_range_generic call to fail
(with -ENOSPC) and then for the i915_gem_evict_something call to fail with
an error as well. That should be simplest to achieve (and indeed does
happen, as the bug indicates) when eviction has to do a blocking wait for
the gpu to finish with a buffer it's still using and that wait gets
interrupted.
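
To make that failing sequence concrete, here is a rough userspace model of
the unwind path after the patch. This is only a sketch: fake_bind, the
fake_state/fake_faults structs and the bogus_removes counter are all made
up for illustration and are not i915 code, and -EINTR stands in for
-ERESTARTSYS, which userspace headers don't define.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real i915 structs. */
struct fake_state {
	bool pages_pinned;
	bool vma_alive;
	bool node_allocated;
	int bogus_removes;	/* removals attempted on an unallocated node */
};

/* Knobs to inject a failure at each step (0 means the step succeeds). */
struct fake_faults {
	int vma_err;		/* vma allocation failure */
	int insert_err;		/* first node insertion, e.g. -ENOSPC */
	int evict_err;		/* eviction, e.g. -EINTR for -ERESTARTSYS */
	int prepare_err;	/* gtt prepare failure */
};

/* Models drm_mm_remove_node(): removing an unallocated node is the bug. */
static void fake_remove_node(struct fake_state *s)
{
	if (!s->node_allocated)
		s->bogus_removes++;
	s->node_allocated = false;
}

static int fake_bind(struct fake_state *s, const struct fake_faults *f)
{
	int insert_err = f->insert_err;
	int ret;

	s->pages_pinned = true;

	ret = f->vma_err;
	if (ret)
		goto err_unpin;
	s->vma_alive = true;

search_free:
	ret = insert_err;
	if (ret) {
		ret = f->evict_err;
		if (ret == 0) {
			insert_err = 0;	/* eviction made room, so retry */
			goto search_free;
		}
		/* No node was ever allocated, so skip err_remove_node. */
		goto err_free_vma;
	}
	s->node_allocated = true;

	ret = f->prepare_err;
	if (ret)
		goto err_remove_node;

	return 0;

	/* Labels unwind in reverse order of the setup steps above. */
err_remove_node:
	fake_remove_node(s);
err_free_vma:
	s->vma_alive = false;
err_unpin:
	s->pages_pinned = false;
	return ret;
}
```

With insert_err = -ENOSPC and evict_err = -EINTR the model returns the
eviction error and never touches the unallocated node, which is the case
the old single err_out label got wrong.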

That's why I thought gem_tiled_blits/interruptible should hit this: it
thrashes the gtt (so plenty of -ENOSPC) and has the 2nd thread
interrupting us in parallel (so we should see the occasional
-ERESTARTSYS). But apparently the tuning is wrong for this case. I guess
we could crank up the signal rate (we might need a real interface for that
anyway, since the current value is the most the wait ioctl test can stomach
before failing), or use much bigger objects to get more gtt contention
going. I'd still only use the first fixed part to do blits, but add a
variable part at the end to mix up the gtt drm_mm as much as possible.
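
For the signal rate knob, something along these lines is what I have in
mind. It is a plain POSIX sketch, not the existing igt helper:
start_interrupts, stop_interrupts and count_interrupted_waits are
hypothetical names, and nanosleep stands in for the blocking ioctl we
actually want to interrupt.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t signals_seen;

static void on_alarm(int sig)
{
	(void)sig;
	signals_seen++;
}

/* Deliver SIGALRM every interval_us microseconds (the tunable rate). */
static int start_interrupts(long interval_us)
{
	struct sigaction sa;
	struct itimerval it;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	/* Deliberately no SA_RESTART: blocking calls must fail with EINTR. */
	if (sigaction(SIGALRM, &sa, NULL))
		return -errno;

	it.it_interval.tv_sec = interval_us / 1000000;
	it.it_interval.tv_usec = interval_us % 1000000;
	it.it_value = it.it_interval;
	if (setitimer(ITIMER_REAL, &it, NULL))
		return -errno;

	return 0;
}

/* Disarm the timer again. */
static void stop_interrupts(void)
{
	struct itimerval off;

	memset(&off, 0, sizeof(off));
	setitimer(ITIMER_REAL, &off, NULL);
}

/* Block for roughly total_ms and count how often the wait was interrupted. */
static int count_interrupted_waits(long total_ms)
{
	struct timespec req = { total_ms / 1000, (total_ms % 1000) * 1000000L };
	struct timespec rem;
	int interrupted = 0;

	while (nanosleep(&req, &rem) == -1 && errno == EINTR) {
		interrupted++;
		req = rem;	/* resume with the time still remaining */
	}

	return interrupted;
}
```

A shorter interval_us means more interrupted waits per second, which is
the rate a subtest would want to set for itself instead of sharing one
hardcoded value with the wait ioctl test.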

Ofc this should be a new subtest so that we don't reduce existing test
coverage.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


