[Intel-gfx] [PATCH] drm/i915: Hold a RPM reference during i915_driver_unload

Joonas Lahtinen joonas.lahtinen at linux.intel.com
Thu Dec 31 03:27:01 PST 2015


On to, 2015-12-31 at 12:44 +0200, Gabriel Feceoru wrote:
> 
> On 30.12.2015 15:03, Joonas Lahtinen wrote:
> > Hi

<SNIP>

> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -1136,6 +1136,8 @@ int i915_driver_unload(struct drm_device
> > > *dev)
> > >   	struct drm_i915_private *dev_priv = dev->dev_private;
> > >   	int ret;
> > > 
> > > +	intel_runtime_pm_get(dev_priv);
> > > +
> > >   	intel_fbdev_fini(dev);
> > > 
> > >   	i915_audio_component_cleanup(dev_priv);
> > > @@ -1143,6 +1145,7 @@ int i915_driver_unload(struct drm_device
> > > *dev)
> > >   	ret = i915_gem_suspend(dev);
> > >   	if (ret) {
> > >   		DRM_ERROR("failed to idle hardware: %d\n",
> > > ret);
> > > +		intel_runtime_pm_put(dev_priv);
> > 
> > This should be made into goto construct.
> I cannot have one instance only of intel_runtime_put() at the end of
> the 
> function, since one exit branch uses kfree(dev_priv) and the other 
> doesn't. Similar to i915_driver_load()
> If this comment refers just to the 'return ret' - this is a legacy
> issue 
> not related to this fix. Could be seen in many places
> (i915_driver_load 
> again), just check i915_driver_open() where the last 3 lines of code 
> could be replaced with 'return ret'

In that case a second path is made for erroring out of the function
(see i915_driver_load):
	...
	return 0;
label1:
	...
labelN:
	intel_runtime_pm_put(dev_priv)
	return ret;

> > 
> > >   		return ret;
> > >   	}
> > > 
> > > @@ -1221,6 +1224,9 @@ int i915_driver_unload(struct drm_device
> > > *dev)
> > >   	kmem_cache_destroy(dev_priv->vmas);
> > >   	kmem_cache_destroy(dev_priv->objects);
> > >   	pci_dev_put(dev_priv->bridge_dev);
> > > +
> > > +	intel_runtime_pm_put(dev_priv);
> > > +
> > 
> > Not sure if we should/can keep the runtime reference until this
> > point.
> > At worst this could lead into the runtime_pm_put function poking at
> > the
> > hardware registers after the pci_dev has been released.
> > 
> > Also if we change the hangcheck task to execute depending on the
> > runtime_pm count, this will surely cause trouble. Added Imre as CC
> > to
> > comment on this.
> I'm not sure either, but again, the same is done in
> i915_driver_load().

Except that i915_driver_load is the function during which autosuspend
is enabled by calling intel_runtime_pm_enable so it's quite a different
animal. But I think that function has it incorrectly too, the return
path for no errors, _put is correct, but the return path for errors,
the _put should really be _put_noidle because the whole RPM thing was
torn down already see below for explanation.

The load/unload sequences are really not in sync (compare the _unload
function to the _load function teardown path for example), and in
addition to that it is not exactly clear (without going through all the
functions one by one) which functions might end up altering the
hardware state and do require the wakeref and which are tearing down
pure software constructs, so I might be off here, but this is what it
looks like to me.

I feel uneasy for the fact we're calling runtime_pm_put after releasing
the PCI device which basically should be the last put and which should
end up suspending the device when there is no device being held
anymore.

Also, in i915_driver_unload function the PM system teardown has already
been performed by intel_power_domains_fini way earlier in the function,
after which I think we can call at most runtime_pm_put_noidle because
essentially the device has been shut down already. 

I do not think those error paths have been stressed much.

Regards, Joonas

> Waiting for Imre's feedback on this.
> > 
> > >   	kfree(dev_priv);
> > > 
> > > 	return 0;
> > 
> > Insert goto label around here and make it "return ret;".
> > 
> > Regards, Joonas
> > 
> > > 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation



More information about the Intel-gfx mailing list