[PATCH] drm/i915/gvt: Add missing vfio_unregister_group_dev() call
Tian, Kevin
kevin.tian at intel.com
Wed Oct 19 10:13:39 UTC 2022
> From: Wang, Zhi A <zhi.a.wang at intel.com>
> Sent: Wednesday, October 19, 2022 5:41 PM
>
> On 10/6/22 18:31, Alex Williamson wrote:
> > On Thu, 6 Oct 2022 08:37:09 -0300
> > Jason Gunthorpe <jgg at nvidia.com> wrote:
> >
> >> On Wed, Oct 05, 2022 at 04:03:56PM -0600, Alex Williamson wrote:
> >>> We can't have a .remove callback that does nothing, this breaks
> >>> removing the device while it's in use. Once we have the
> >>> vfio_unregister_group_dev() fix below, we'll block until the device is
> >>> unused, at which point vgpu->attached becomes false. Unless I'm
> >>> missing something, I think we should also follow-up with a patch to
> >>> remove that bogus warn-on branch, right? Thanks,
> >>
> >> Yes, looks right to me.
> >>
> >> I question all the logic around attached; where is the locking?
> >
> > Zhenyu, Zhi, Kevin,
> >
> > Could someone please take a look at use of vgpu->attached in the GVT-g
> > driver? Its use in intel_vgpu_remove() is bogus, the .release
> > callback needs to use vfio_unregister_group_dev() to wait for the
> > device to be unused. The WARN_ON/return here breaks all future use of
> > the device. I assume @attached has something to do with the page table
> > interface with KVM, but it all looks racy anyway.
> >
> Thanks for pointing this out.
>
> It was introduced in the GVT-g refactor patch series, and Christoph might
> not have wanted to touch vgpu->released when he needed a new state.
>
> I dug into it a bit. vgpu->attached is used to prevent multiple opens of
> a single vGPU and to indicate that kvm_get_kvm() has been done.

vfio core already ensures that .open_device() is called only once:

vfio_device_open()
{
	...
	mutex_lock(&device->dev_set->lock);
	device->open_count++;
	if (device->open_count == 1) {
		...
		if (device->ops->open_device) {
			ret = device->ops->open_device(device);
	...
}

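To make the point concrete, here is a minimal user-space sketch of the same
open-count gating. It is only a model, not the vfio code: struct demo_device
and all demo_* functions are hypothetical names, a pthread mutex stands in
for dev_set->lock, and the driver callbacks merely count how often they run.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space stand-in; not the kernel's struct vfio_device. */
struct demo_device {
	pthread_mutex_t lock;       /* plays the role of dev_set->lock */
	unsigned int open_count;    /* number of current openers */
	unsigned int driver_opens;  /* times the driver open callback ran */
	unsigned int driver_closes; /* times the driver close callback ran */
};

static void demo_device_init(struct demo_device *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->open_count = 0;
	dev->driver_opens = 0;
	dev->driver_closes = 0;
}

/* Stand-in for ops->open_device(); always succeeds in this sketch. */
static int demo_driver_open(struct demo_device *dev)
{
	dev->driver_opens++;
	return 0;
}

/* Stand-in for ops->close_device(). */
static void demo_driver_close(struct demo_device *dev)
{
	dev->driver_closes++;
}

/* Only the first opener reaches the driver callback. */
static int demo_device_open(struct demo_device *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);
	dev->open_count++;
	if (dev->open_count == 1) {
		ret = demo_driver_open(dev);
		if (ret)
			dev->open_count--; /* undo the count on failure */
	}
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/* Only the last closer reaches the driver callback. */
static void demo_device_close(struct demo_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	assert(dev->open_count > 0);
	if (dev->open_count == 1)
		demo_driver_close(dev);
	dev->open_count--;
	pthread_mutex_unlock(&dev->lock);
}
```

With this shape, a second demo_device_open() only bumps the count, so the
driver side does not need its own flag just to reject a repeated open.
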
> vgpu->released was to prevent the release before close, which is now
> handled by the vfio_device_*.
>
> What I would like to do is:
> 1) Remove vgpu->released.
> 2) Use a lock to protect vgpu->attached.
>
> Once those are resolved, the WARN_ON/return in intel_vgpu_remove()
> can be safely removed, as .release will be called after .close_device
> decreases vfio_device->refcnt to zero.
>
> Thanks,
> Zhi.
>
> > Also, whatever purpose vgpu->released served looks unnecessary now.
> > Thanks,
> >
> > Alex
> >
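
For the remove path discussed above, the behaviour being asked for (an
unregister call that blocks until the device is unused, instead of a
WARN_ON/return) can be modelled in user space roughly as follows. This is a
sketch under assumptions, not how vfio_unregister_group_dev() is implemented:
struct demo_reg and the demo_* helpers are hypothetical, and a pthread
condition variable stands in for whatever wait primitive the kernel uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical registration state; not a kernel structure. */
struct demo_reg {
	pthread_mutex_t lock;
	pthread_cond_t unused;  /* signalled when users drops to zero */
	unsigned int users;     /* current users of the device */
	bool registered;        /* cleared once unregister has started */
};

static void demo_register(struct demo_reg *r)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->unused, NULL);
	r->users = 0;
	r->registered = true;
}

/* Take a user reference; fails once unregister has begun. */
static bool demo_try_get(struct demo_reg *r)
{
	bool ok;

	pthread_mutex_lock(&r->lock);
	ok = r->registered;
	if (ok)
		r->users++;
	pthread_mutex_unlock(&r->lock);
	return ok;
}

/* Drop a user reference and wake a waiting unregister if it was the last. */
static void demo_put(struct demo_reg *r)
{
	pthread_mutex_lock(&r->lock);
	assert(r->users > 0);
	if (--r->users == 0)
		pthread_cond_broadcast(&r->unused);
	pthread_mutex_unlock(&r->lock);
}

/* Refuse new users, then block until every existing user is gone. */
static void demo_unregister(struct demo_reg *r)
{
	pthread_mutex_lock(&r->lock);
	r->registered = false;
	while (r->users > 0)
		pthread_cond_wait(&r->unused, &r->lock);
	pthread_mutex_unlock(&r->lock);
}
```

In this model the remove path never has to bail out early: by the time
demo_unregister() returns there are no users left, so a check like the
vgpu->attached one would have nothing to catch.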