[Intel-gfx] [PATCH V2 5/8] mdev: introduce device specific ops
Alex Williamson
alex.williamson at redhat.com
Thu Sep 26 16:26:08 UTC 2019
On Thu, 26 Sep 2019 11:46:55 -0400
"Michael S. Tsirkin" <mst at redhat.com> wrote:
> On Wed, Sep 25, 2019 at 10:30:28AM -0600, Alex Williamson wrote:
> > On Wed, 25 Sep 2019 10:11:00 -0400
> > Rob Miller <rob.miller at broadcom.com> wrote:
> > > > > On Tue, 24 Sep 2019 21:53:29 +0800
> > > > > Jason Wang <jasowang at redhat.com> wrote:
> > > > > > diff --git a/drivers/vfio/mdev/vfio_mdev.c
> > > > > b/drivers/vfio/mdev/vfio_mdev.c
> > > > > > index 891cf83a2d9a..95efa054442f 100644
> > > > > > --- a/drivers/vfio/mdev/vfio_mdev.c
> > > > > > +++ b/drivers/vfio/mdev/vfio_mdev.c
> > > > > > @@ -14,6 +14,7 @@
> > > > > > #include <linux/slab.h>
> > > > > > #include <linux/vfio.h>
> > > > > > #include <linux/mdev.h>
> > > > > > +#include <linux/vfio_mdev.h>
> > > > > >
> > > > > > #include "mdev_private.h"
> > > > > >
> > > > > > @@ -24,16 +25,16 @@
> > > > > > static int vfio_mdev_open(void *device_data)
> > > > > > {
> > > > > > struct mdev_device *mdev = device_data;
> > > > > > - struct mdev_parent *parent = mdev->parent;
> > > > > > + const struct vfio_mdev_device_ops *ops =
> > > > > mdev_get_dev_ops(mdev);
> > > > > > int ret;
> > > > > >
> > > > > > - if (unlikely(!parent->ops->open))
> > > > > > + if (unlikely(!ops->open))
> > > > > > return -EINVAL;
> > > > > >
> > > > > > if (!try_module_get(THIS_MODULE))
> > > > > > return -ENODEV;
> > > >
> > >
> > > RJM>] My understanding lately is that this call to
> > > try_module_get(THIS_MODULE) is no longer needed as is considered as a
> > > latent bug.
> > > Quote from
> > > https://stackoverflow.com/questions/1741415/linux-kernel-modules-when-to-use-try-module-get-module-put
> > > :
> > > There are a number of uses of try_module_get(THIS_MODULE) in the kernel
> > > source but most if not all of them are latent bugs that should be cleaned
> > > up.
> >
> > This use seems to fall exactly into the case where it is necessary, the
> > open here is not a direct VFS call, it's an internal interface between
> > modules. The user is interacting with filesystem objects from the vfio
> > module and the module reference we're trying to acquire here is to the
> > vfio-mdev module. Thanks,
> >
> > Alex
>
>
> I think the latent bug refers not to module get per se,
> but to the module_put tied to it. E.g.:
>
> static void vfio_mdev_release(void *device_data)
> {
> struct mdev_device *mdev = device_data;
> struct mdev_parent *parent = mdev->parent;
>
> if (likely(parent->ops->release))
> parent->ops->release(mdev);
>
> module_put(THIS_MODULE);
>
> Does anything prevent the module from unloading at this point?
> if not then ...
>
>
> }
>
> it looks like the implicit return (with instructions for argument pop
> and functuon return) here can get overwritten on module
> unload, causing a crash when executed.
>
> IOW there's generally no way for module to keep a reference
> to itself: it can take a reference but it needs someone else
> to keep it and put.
I'd always assumed this would exit cleanly, but perhaps there is a
latent race there. In any case, taking a module reference within the
module in this case is better than not doing so, as the latter would
potentially allow the module to be removed at any point in time, while
the former only seems to expose acquire and release gaps. Add it to
the todo list. Thanks,
Alex
More information about the Intel-gfx
mailing list