[Intel-gfx] [PATCH v5 1/2] drm/i915: Fix failure paths around initial fbdev allocation

Ville Syrjälä ville.syrjala at linux.intel.com
Thu Oct 15 10:34:23 PDT 2015


On Thu, Oct 15, 2015 at 07:14:35PM +0200, Lukas Wunner wrote:
> Hi Ville,
> 
> On Tue, Oct 13, 2015 at 06:04:40PM +0300, Ville Syrjälä wrote:
> > On Tue, Jun 30, 2015 at 10:06:27AM +0100, Lukas Wunner wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > 
> > > We had two failure modes here:
> > > 
> > > 1.
> > > Deadlock in intelfb_alloc failure path where it calls
> > > drm_framebuffer_remove, which grabs the struct mutex and intelfb_create
> > > (caller of intelfb_alloc) was already holding it.
> > > 
> > > 2.
> > > Deadlock in intelfb_create failure path where it calls
> > > drm_framebuffer_unreference, which grabs the struct mutex and
> > > intelfb_create was already holding it.
> > > 
> > > v2:
> > >    * Reformat commit msg to 72 chars. (Lukas Wunner)
> > >    * Add third failure mode. (Lukas Wunner)
> > > 
> > > v3:
> > >    * On fb alloc failure, unref gem object where it gets refed,
> > >      fix double unref in separate commit. (Ville Syrjälä)
> > > 
> > > v4:
> > >    * Lock struct mutex on unref. (Chris Wilson)
> > > 
> > > v5:
> > >    * Rebase on drm-intel-nightly 2015y-09m-04d-08h-19m-35s UTC,
> > >      rephrase commit message. (Jani Nikula)
> > > 
> > > Tested-by: Pierre Moreau <pierre.morrow at free.fr>
> > >     [MBP  5,3 2009  nvidia 9400M + 9600M GT   pre-retina]
> > > Tested-by: Paul Hordiienko <pvt.gord at gmail.com>
> > >     [MBP  6,2 2010  intel ILK + nvidia GT216  pre-retina]
> > > Tested-by: William Brown <william at blackhats.net.au>
> > >     [MBP  8,2 2011  intel SNB + amd turks     pre-retina]
> > > Tested-by: Lukas Wunner <lukas at wunner.de>
> > >     [MBP  9,1 2012  intel IVB + nvidia GK107  pre-retina]
> > > Tested-by: Bruno Bierbaumer <bruno at bierbaumer.net>
> > >     [MBP 11,3 2013  intel HSW + nvidia GK107  retina]
> > > 
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > Fixes: 60a5ca015ffd ("drm/i915: Add locking around
> > >     framebuffer_references--")
> > > Reported-by: Lukas Wunner <lukas at wunner.de>
> > > [Lukas: Create v3 + v4 + v5 based on Tvrtko's v2]
> > > Signed-off-by: Lukas Wunner <lukas at wunner.de>
> > > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > Cc: Jani Nikula <jani.nikula at intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_fbdev.c | 20 ++++++++++++--------
> > >  1 file changed, 12 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> > > index 96476d7..eee3306 100644
> > > --- a/drivers/gpu/drm/i915/intel_fbdev.c
> > > +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> > > @@ -119,7 +119,7 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
> > >  {
> > >  	struct intel_fbdev *ifbdev =
> > >  		container_of(helper, struct intel_fbdev, helper);
> > > -	struct drm_framebuffer *fb;
> > > +	struct drm_framebuffer *fb = NULL;
> > >  	struct drm_device *dev = helper->dev;
> > >  	struct drm_mode_fb_cmd2 mode_cmd = {};
> > >  	struct drm_i915_gem_object *obj;
> > > @@ -137,6 +137,8 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
> > >  	mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
> > >  							  sizes->surface_depth);
> > >  
> > > +	mutex_lock(&dev->struct_mutex);
> > > +
> > >  	size = mode_cmd.pitches[0] * mode_cmd.height;
> > >  	size = PAGE_ALIGN(size);
> > >  	obj = i915_gem_object_create_stolen(dev, size);
> > > @@ -158,18 +160,21 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
> > >  	ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL, NULL, NULL);
> > >  	if (ret) {
> > >  		DRM_ERROR("failed to pin obj: %d\n", ret);
> > > -		goto out_fb;
> > > +		goto out_unref;
> > >  	}
> > >  
> > > +	mutex_unlock(&dev->struct_mutex);
> > > +
> > >  	ifbdev->fb = to_intel_framebuffer(fb);
> > >  
> > >  	return 0;
> > >  
> > > -out_fb:
> > > -	drm_framebuffer_remove(fb);
> > >  out_unref:
> > >  	drm_gem_object_unreference(&obj->base);
> > 
> > If fb init succeeded it took over the ref, no? So drm_framebuffer_remove()
> > will now attempt to unref one too many times.
> > 
> > This taking over refs stuff is confusing. Maybe it would be better if
> > everyone just took an extra ref when they stash the obj pointer
> > somewhere, and everyone would then always release whatever ref they own
> > and no longer need.
> > 
> > >  out:
> > > +	mutex_unlock(&dev->struct_mutex);
> > > +	if (fb)
> > > +		drm_framebuffer_remove(fb);
> > >  	return ret;
> > >  }
> > >  
> 
> Hm, why do you think we unref one too many times?

Because the fb now owns the reference, so it gets unreffed by the fb
.destroy() hook... I think.
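
A userspace model of that concern, as a rough sketch only. This is not
i915 code: struct bo, struct fb, fb_init(), fb_remove() and bo_unref()
are hypothetical stand-ins, and the model assumes (as suspected above)
that a successful fb init takes over the caller's bo reference and that
the fb's .destroy() hook drops it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GEM object and the framebuffer. */
struct bo { int refcount; };
struct fb { int refcount; struct bo *obj; };

static void bo_unref(struct bo *bo)
{
	bo->refcount--;
}

/* Assumption: on success the fb takes over the caller's bo reference. */
static void fb_init(struct fb *fb, struct bo *bo)
{
	fb->refcount = 1;
	fb->obj = bo;
}

/* Models the .destroy() hook: the last fb unref drops the bo ref it owns. */
static void fb_remove(struct fb *fb)
{
	if (--fb->refcount == 0)
		bo_unref(fb->obj);
}

/* Error path ordered like the patch: unref the bo, then remove the fb. */
static int error_path_as_patched(void)
{
	struct bo bo = { .refcount = 1 };
	struct fb fb;

	fb_init(&fb, &bo);
	bo_unref(&bo);		/* out_unref: a ref the caller no longer owns */
	fb_remove(&fb);		/* out: the fb drops the same ref again */
	return bo.refcount;
}

/* Error path that only tears down the fb once it owns the bo ref. */
static int error_path_fb_only(void)
{
	struct bo bo = { .refcount = 1 };
	struct fb fb;

	fb_init(&fb, &bo);
	fb_remove(&fb);
	return bo.refcount;
}
```

In this model the first path ends one reference below zero, while the
second ends balanced at zero. With a real kref the extra put would mean
a use-after-free rather than a negative counter.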

> 
> A bit further up in this function we call __intel_framebuffer_create()
> which sets the refcount to 1. (It calls intel_framebuffer_init(), which
> calls drm_framebuffer_init(), which calls kref_init(&fb->refcount).)
> 
> So if intel_pin_and_fence_fb_obj() fails, we do need to unreference and
> tear down the fb. Thus, drm_framebuffer_remove() seems right here to me.

I wasn't complaining about the fb unref, but the bo unref.

> 
> However, because of your objection I've noticed now that "if (fb)" seems
> to be wrong, I think this should be "if (!IS_ERR_OR_NULL(fb))".
> 
> Because if __intel_framebuffer_create() failed, fb will be an ERR_PTR(),
> so not null, and we'd call drm_framebuffer_remove() on this. Is that
> what you meant?

No, but that's a good observation too.
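
For reference, a userspace sketch of why "if (fb)" is not enough when a
function reports failure through an error pointer. The helpers below are
lowercase re-implementations written for illustration, modelled on the
kernel's ERR_PTR()/PTR_ERR()/IS_ERR()/IS_ERR_OR_NULL(); naive_guard() and
safe_guard() are hypothetical names for the two checks being compared.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static void *err_ptr(long error)
{
	return (void *)error;
}

static long ptr_err(const void *ptr)
{
	return (long)ptr;
}

static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}

/* Returns 1 if cleanup guarded by "if (fb)" would run for this pointer. */
static int naive_guard(const void *fb)
{
	return fb != NULL;
}

/* Returns 1 if cleanup guarded by "if (!IS_ERR_OR_NULL(fb))" would run. */
static int safe_guard(const void *fb)
{
	return !is_err_or_null(fb);
}
```

An error pointer is non-NULL, so the naive guard lets it through to the
cleanup call, whereas the second guard rejects both NULL and error
pointers.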

> 
> 
> > > @@ -187,8 +192,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > >  	int size, ret;
> > >  	bool prealloc = false;
> > >  
> > > -	mutex_lock(&dev->struct_mutex);
> > > -
> > >  	if (intel_fb &&
> > >  	    (sizes->fb_width > intel_fb->base.width ||
> > >  	     sizes->fb_height > intel_fb->base.height)) {
> > > @@ -203,7 +206,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > >  		DRM_DEBUG_KMS("no BIOS fb, allocating a new one\n");
> > >  		ret = intelfb_alloc(helper, sizes);
> > >  		if (ret)
> > > -			goto out_unlock;
> > > +			return ret;
> > >  		intel_fb = ifbdev->fb;
> > >  	} else {
> > >  		DRM_DEBUG_KMS("re-using BIOS fb\n");
> > > @@ -215,6 +218,8 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > >  	obj = intel_fb->obj;
> > >  	size = obj->base.size;
> > >  
> > > +	mutex_lock(&dev->struct_mutex);
> > > +
> > 
> > I'm thinking we won't even need the lock here anymore. But maybe I'm
> > missing something.
> > 
> > >  	info = drm_fb_helper_alloc_fbi(helper);
> > >  	if (IS_ERR(info)) {
> > >  		ret = PTR_ERR(info);
> > > @@ -276,7 +281,6 @@ out_destroy_fbi:
> > >  out_unpin:
> > >  	i915_gem_object_ggtt_unpin(obj);
> > >  	drm_gem_object_unreference(&obj->base);
> > 
> > And this ref we don't own either AFAICS.
> 
> Why? We did call intelfb_alloc() above, so if something subsequently
> goes wrong, we need to revert the steps that intelfb_alloc() carried
> out. The drm_gem_object_unreference() therefore seems right here to me.

Here too the fb (if successfully created) now owns that reference.
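
The alternative convention suggested earlier in the thread, where every
holder takes its own reference when it stashes the obj pointer and later
releases only what it owns, could look roughly like the sketch below.
Again a userspace model under assumed names (bo_get(), bo_put(),
fb_init(), fb_destroy()), not the driver's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GEM object and the framebuffer. */
struct bo { int refcount; };
struct fb { struct bo *obj; };

static struct bo *bo_get(struct bo *bo)
{
	bo->refcount++;
	return bo;
}

static void bo_put(struct bo *bo)
{
	bo->refcount--;
}

/* The fb takes its own reference instead of inheriting the caller's. */
static void fb_init(struct fb *fb, struct bo *bo)
{
	fb->obj = bo_get(bo);
}

/* The fb releases exactly the reference it took in fb_init(). */
static void fb_destroy(struct fb *fb)
{
	bo_put(fb->obj);
	fb->obj = NULL;
}

/* A later step fails: each owner drops its own ref, in any order. */
static int alloc_then_fail(void)
{
	struct bo bo = { .refcount = 1 };	/* creator's reference */
	struct fb fb;

	fb_init(&fb, &bo);	/* refcount == 2 */
	fb_destroy(&fb);	/* fb's reference gone */
	bo_put(&bo);		/* creator's reference gone */
	return bo.refcount;
}

/* Success: the creator drops its ref, the fb keeps the object alive. */
static int alloc_success(void)
{
	struct bo bo = { .refcount = 1 };
	struct fb fb;

	fb_init(&fb, &bo);
	bo_put(&bo);
	return bo.refcount;
}
```

With this scheme the error path needs no knowledge of whether fb init
got far enough to take over a reference; every put pairs with a get made
by the same owner.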

> 
> However I'm puzzled why we don't call drm_framebuffer_remove() under
> the out_unpin: label. Aren't we leaking a framebuffer here without that?
> 
> Maybe you're referring to the fact that this function either inherits
> the BIOS fb or creates a new fb with intelfb_alloc(). I'm not sure if
> the cleanup on error is identical in these two cases. Maybe you meant
> that we don't own the ref in the case that the fb was inherited from
> BIOS?
> 
> 
> Best regards,
> 
> Lukas
> 
> > 
> > > -out_unlock:
> > >  	mutex_unlock(&dev->struct_mutex);
> > >  	return ret;
> > >  }
> > > -- 
> > > 2.1.0
> > 
> > -- 
> > Ville Syrjälä
> > Intel OTC

-- 
Ville Syrjälä
Intel OTC