[Intel-gfx] [PATCH 2/6] drm/drv: Prepare to remove drm_dev_unplug()
Daniel Vetter
daniel at ffwll.ch
Tue Feb 5 16:31:44 UTC 2019
On Tue, Feb 05, 2019 at 11:20:55AM +0100, Noralf Trønnes wrote:
>
>
> Den 05.02.2019 10.11, skrev Daniel Vetter:
> > On Mon, Feb 04, 2019 at 06:35:28PM +0100, Noralf Trønnes wrote:
> >>
> >>
> >> Den 04.02.2019 16.41, skrev Daniel Vetter:
> >>> On Sun, Feb 03, 2019 at 04:41:56PM +0100, Noralf Trønnes wrote:
> >>>> The only thing now that makes drm_dev_unplug() special is that it sets
> >>>> drm_device->unplugged. Move this code to drm_dev_unregister() so that we
> >>>> can remove drm_dev_unplug().
> >>>>
> >>>> Signed-off-by: Noralf Trønnes <noralf at tronnes.org>
> >>>> ---
> >>
> >> [...]
> >>
> >>>> drivers/gpu/drm/drm_drv.c | 27 +++++++++++++++------------
> >>>> include/drm/drm_drv.h | 10 ++++------
> >>>> 2 files changed, 19 insertions(+), 18 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>> index 05bbc2b622fc..e0941200edc6 100644
> >>>> --- a/drivers/gpu/drm/drm_drv.c
> >>>> +++ b/drivers/gpu/drm/drm_drv.c
> >>>> @@ -366,15 +366,6 @@ EXPORT_SYMBOL(drm_dev_exit);
> >>>> */
> >>>> void drm_dev_unplug(struct drm_device *dev)
> >>>> {
> >>>> - /*
> >>>> - * After synchronizing any critical read section is guaranteed to see
> >>>> - * the new value of ->unplugged, and any critical section which might
> >>>> - * still have seen the old value of ->unplugged is guaranteed to have
> >>>> - * finished.
> >>>> - */
> >>>> - dev->unplugged = true;
> >>>> - synchronize_srcu(&drm_unplug_srcu);
> >>>> -
> >>>> drm_dev_unregister(dev);
> >>>> drm_dev_put(dev);
> >>>> }
> >>>> @@ -832,11 +823,14 @@ EXPORT_SYMBOL(drm_dev_register);
> >>>> * drm_dev_register() but does not deallocate the device. The caller must call
> >>>> * drm_dev_put() to drop their final reference.
> >>>> *
> >>>> - * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
> >>>> - * which can be called while there are still open users of @dev.
> >>>> + * This function can be called while there are still open users of @dev as long
> >>>> + * as the driver protects its device resources using drm_dev_enter() and
> >>>> + * drm_dev_exit().
> >>>> *
> >>>> * This should be called first in the device teardown code to make sure
> >>>> - * userspace can't access the device instance any more.
> >>>> + * userspace can't access the device instance any more. Drivers that support
> >>>> + * device unplug will probably want to call drm_atomic_helper_shutdown() first
> >>>
> >>> Read once more with a bit more coffee, spotted this:
> >>>
> >>> s/first/afterwards/ - shutting down the hw before we've taken it away from
> >>> userspace is kinda the wrong way round. It should be the inverse of driver
> >>> load, which is 1) allocate structures 2) prep hw 3) register driver with
> >>> the world (simplified ofc).
> >>>
> >>
> >> The problem is that drm_dev_unregister() sets the device as unplugged
> >> and if drm_atomic_helper_shutdown() is called afterwards it's not
> >> allowed to touch hardware.
> >>
> >> I know it's the wrong order, but the only way to do it in the right
> >> order is to have a separate function that sets unplugged:
> >>
> >> drm_dev_unregister();
> >> drm_atomic_helper_shutdown();
> >> drm_dev_set_unplugged();
> >
> > Annoying ... but yeah calling _shutdown() before we stopped userspace is
> > also not going to work. Because userspace could quickly re-enable
> > something, and then the refcounts would be all wrong again and leaking
> > objects.
> >
>
> What happens with a USB device that is unplugged with open userspace,
> will that leak objects?
Maybe we've jumped to conclusions. drm_atomic_helper_shutdown() will run
as normal, the only thing that should be skipped is actually touching the
hw (as long as the driver doesn't protect too much with
drm_dev_enter/exit). So all the software updates (including refcounting
updates) will still be done. Ofc current udl is not yet atomic, so in
reality something else happens.
And we ofc still have the same issue that if you just unload the driver,
then the hw will stay on (which might really confuse the driver on next
load, when it assumes that it only gets loaded from cold boot where
everything is off - which usually is the case on an arm soc at least).
> > I get a bit the feeling we're over-optimizing here with trying to devm-ize
> > drm_dev_register. Just getting drm_device correctly devm-ized is a big
> > step forward already, and will open up a lot of TODO items across a lot of
> > drivers. E.g. we could add a drm_dev_kzalloc, for allocating all the drm_*
> > structs, which gets released together with drm_device. I think that's a
> > much clearer path forward, I think we all agree that getting the kfree out
> > of the driver codes is a good thing, and it would allow us to do this
> > correctly.
> >
> > Then once we have that and rolled out to a few drivers we can reconsider
> > the entire unregister/shutdown gordian knot here. Atm I just have no idea
> > how to do this properly :-/
> >
> > Thoughts, other ideas?
> >
>
> Yeah, I've come to the conclusion that devm_drm_dev_register() doesn't
> make much sense if we need a driver remove callback anyways.
Yup, that's what I meant with the above:
- merge devm_drm_dev_register, use it a lot. This is definitely a good
idea.
- create a drm_dev_kzalloc, which automatically releases memory on the
final drm_dev_put. Use it every in drivers for drm structures,
especially in those drivers that currently use devm (which releases
those allocations potentialy too early on unplug).
- figure out the next steps after that
> I think devm_drm_dev_init() makes sense because it yields a cleaner
> probe() function. An additional benefit is that it requires a
> drm_driver->release function which is a step in the right direction to
> get the drm_device lifetime right.
>
> Do we agree that a drm_dev_set_unplugged() function is necessary to get
> the remove/disconnect order right?
>
> What about drm_dev_unplug() maybe I should just leave it be?
>
> - amd uses drm_driver->unload, so that one takes some work to get right
> to support unplug. It doesn't check the unplugged state, so really
> doesn't need drm_dev_unplug() I guess. Do they have cards that can be
> hotplugged?
Yeah amd still uses ->load and ->unload, which is not great unfortunately.
I just stumbled over that when I tried to make a series to disable the
global drm_global_mutex for most drivers ...
> - udl uses drm_driver->unload, doesn't use drm_atomic_helper_shutdown().
> It has only one drm_dev_is_unplugged() check and that is in its
> fbdev->open callback.
udl isn't atomic, so can't use the atomic helpers. pre-atomic doesn't have
refcounting issues when something is left on iirc. I think udl is written
under the assumption that ->unload is always called for an unplug, never
for an actual unload.
> - xen calls drm_atomic_helper_shutdown() in its drm_driver->release
> callback which I guess is not correct.
Yeah this smells fishy. ->release is supposed to be for cleaning up kernel
structures, not for cleaning up the hw. So maybe drm_mode_config_cleanup
could be put there, not sure honestly. But definitely not _shutdown().
> This is what it will look like with a set unplugged function:
>
> void drm_dev_unplug(struct drm_device *dev)
> {
> drm_dev_set_unplugged(dev);
> drm_dev_unregister(dev);
> drm_dev_put(dev);
> }
>
> Hm, I should probably remove it to avoid further use of it since it's
> not correct for a modern driver.
I think something flew over my head ... what's the drm_dev_set_unplugged
for? I think I'm not following you ...
Thanks, Daniel
>
> Noralf.
>
> > Cheers, Daniel
> >
> >> Noralf.
> >>
> >>>> + * in order to disable the hardware on regular driver module unload.
> >>>> */
> >>>> void drm_dev_unregister(struct drm_device *dev)
> >>>> {
> >>>> @@ -845,6 +839,15 @@ void drm_dev_unregister(struct drm_device *dev)
> >>>> if (drm_core_check_feature(dev, DRIVER_LEGACY))
> >>>> drm_lastclose(dev);
> >>>>
> >>>> + /*
> >>>> + * After synchronizing any critical read section is guaranteed to see
> >>>> + * the new value of ->unplugged, and any critical section which might
> >>>> + * still have seen the old value of ->unplugged is guaranteed to have
> >>>> + * finished.
> >>>> + */
> >>>> + dev->unplugged = true;
> >>>> + synchronize_srcu(&drm_unplug_srcu);
> >>>> +
> >>>> dev->registered = false;
> >>>>
> >>>> drm_client_dev_unregister(dev);
> >>>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> >>>> index ca46a45a9cce..c50696c82a42 100644
> >>>> --- a/include/drm/drm_drv.h
> >>>> +++ b/include/drm/drm_drv.h
> >>>> @@ -736,13 +736,11 @@ void drm_dev_unplug(struct drm_device *dev);
> >>>> * drm_dev_is_unplugged - is a DRM device unplugged
> >>>> * @dev: DRM device
> >>>> *
> >>>> - * This function can be called to check whether a hotpluggable is unplugged.
> >>>> - * Unplugging itself is singalled through drm_dev_unplug(). If a device is
> >>>> - * unplugged, these two functions guarantee that any store before calling
> >>>> - * drm_dev_unplug() is visible to callers of this function after it completes
> >>>> + * This function can be called to check whether @dev is unregistered. This can
> >>>> + * be used to detect that the underlying parent device is gone.
> >>>
> >>> I think it'd be good to keep the first part, and just update the reference
> >>> to drm_dev_unregister. So:
> >>>
> >>> * This function can be called to check whether a hotpluggable is unplugged.
> >>> * Unplugging itself is singalled through drm_dev_unregister(). If a device is
> >>> * unplugged, these two functions guarantee that any store before calling
> >>> * drm_dev_unregister() is visible to callers of this function after it
> >>> * completes.
> >>>
> >>> I think your version shrugs a few important details under the rug. With
> >>> those nits addressed:
> >>>
> >>> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> >>>
> >>> Cheers, Daniel
> >>>
> >>>> *
> >>>> - * WARNING: This function fundamentally races against drm_dev_unplug(). It is
> >>>> - * recommended that drivers instead use the underlying drm_dev_enter() and
> >>>> + * WARNING: This function fundamentally races against drm_dev_unregister(). It
> >>>> + * is recommended that drivers instead use the underlying drm_dev_enter() and
> >>>> * drm_dev_exit() function pairs.
> >>>> */
> >>>> static inline bool drm_dev_is_unplugged(struct drm_device *dev)
> >>>> --
> >>>> 2.20.1
> >>>>
> >>>> _______________________________________________
> >>>> Intel-gfx mailing list
> >>>> Intel-gfx at lists.freedesktop.org
> >>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >>>
> >
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the amd-gfx
mailing list