[Intel-gfx] [PATCH v3] vfio: fix deadlock between group lock and kvm lock

Matthew Rosato mjrosato at linux.ibm.com
Fri Feb 3 17:29:01 UTC 2023


On 2/3/23 10:19 AM, Alex Williamson wrote:
> On Fri, 3 Feb 2023 14:54:44 +0000
> "Liu, Yi L" <yi.l.liu at intel.com> wrote:
> 
>>> From: Alex Williamson <alex.williamson at redhat.com>
>>> Sent: Friday, February 3, 2023 9:50 PM
>>>
>>> On Fri, 3 Feb 2023 13:32:09 +0000
>>> "Liu, Yi L" <yi.l.liu at intel.com> wrote:
>>>   
>>>>> From: Tian, Kevin <kevin.tian at intel.com>
>>>>> Sent: Friday, February 3, 2023 10:00 AM
>>>>>  
>>>>>> From: Alex Williamson <alex.williamson at redhat.com>
>>>>>> Sent: Friday, February 3, 2023 7:13 AM
>>>>>>
>>>>>> On Thu, 2 Feb 2023 23:04:10 +0000
>>>>>> "Tian, Kevin" <kevin.tian at intel.com> wrote:
>>>>>>  
>>>>>>>> From: Alex Williamson <alex.williamson at redhat.com>
>>>>>>>> Sent: Friday, February 3, 2023 3:42 AM
>>>>>>>>
>>>>>>>>
>>>>>>>> LGTM.  I'm not sure moving the functions to vfio_main really buys  
>>> us  
>>>>>>>> anything since we're making so much use of group fields.  The cdev
>>>>>>>> approach will necessarily be different, so the bulk of the get code  
>>> will  
>>>>>>>> likely need to move back to group.c anyway.
>>>>>>>>  
>>>>>>>
>>>>>>> well my last comment was based on Matthew's v2 where the get  
>>> code  
>>>>>>> gets a kvm passed in instead of implicitly retrieving group ref_lock
>>>>>>> internally. In that case the get/put helpers only contain device logic
>>>>>>> thus fit in vfio_main.c.
>>>>>>>
>>>>>>> with v3 then they have to be in group.c since we don't want to use
>>>>>>> group fields in vfio_main.c.
>>>>>>>
>>>>>>> but I still think v2 of the helpers is slightly better. The only difference
>>>>>>> between cdev and group when handling this race is using different
>>>>>>> ref_lock. the symbol get/put part is exactly same. So even if we
>>>>>>> merge v3 like this, very likely Yi has to change it back to v2 style
>>>>>>> to share the get/put helpers while just leaving the ref_lock part
>>>>>>> handled differently between the two path.  
>>>>>>
>>>>>> I'm not really a fan of the asymmetry of the v2 version where the get
>>>>>> helper needs to be called under the new kvm_ref_lock, but the put
>>>>>> helper does not.  Having the get helper handle that makes the caller
>>>>>> much cleaner.  Thanks,
>>>>>>  
>>>>>
>>>>> What about passing the lock pointer into the helper? it's still slightly
>>>>> asymmetry as the put helper doesn't carry the lock pointer but it
>>>>> could also be interpreted as if the pointer has been saved in the get
>>>>> then if it needs to be referenced by the put there is no need to pass
>>>>> it in again.  
>>>>
>>>> For cdev, I may modify vfio_device_get_kvm_safe() to accept
>>>> struct kvm and let its caller hold a kvm_ref_lock (field within
>>>> struct vfio_device_file). Meanwhile, the group path holds
>>>> the group->kvm_ref_lock before invoking vfio_device_get_kvm_safe().
>>>> vfio_device_get_kvm_safe() just includes the symbol get/put and
>>>> the device->kvm and put_kvm set.  
>>>
>>> Sounds a lot like v2 :-\   
>>
>> Yes, like v2. 😊
>>
>>> I'd look more towards group and cdev specific
>>> helpers that handle the locking so that the callers aren't exposed to
>>> the asymmetry of get vs put, and reduce a new
>>> _vfio_device_get_kvm_safe() in common code that only does the symbol
>>> work.  Thanks,  
>>
>> If so, looks like Matthew needs a v4. I'm waiting for the final version
>> of this patch and sending a new cdev series based on it. wish to see
>> it soon ^_^.
> 
> cdev support is a future feature, why does it become a requirement for
> a fix to the current base?  The refactoring could also happen in the
> cdev series.  Thanks,
> 
> Alex
> 

FWIW, that would bloat the fix by a bit and it's already a decent-sized fix...  I took a quick stab and such a v4 would end up with a total of 120 insertions(+), 15 deletions(-).  See below for a diff of what I tried on top of v3. The idea would be to move the serialization and setting/clearing of device->kvm into group/cdev and use device->kvm as the value within vfio_device_get_kvm_safe for getting the kvm ref.  Wrappers in group/cdev would then be responsible for backing that device->kvm setting out on ref failure.
Each side (group/cdev) can wrap their own serialization / load device->kvm from the appropriate location / handle the failed get / clear device->kvm when done (since get doesn't set it, put shouldn't clear it).

If Alex wants, I can wrap something like this into a v4 -- Or, if we don't want to include this kind of infrastructure in the fix, then Yi please feel free to use it as a starting point for what you need in cdev.

Thanks,
Matt

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 7fed4233ca23..261af23975ae 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -154,6 +154,31 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
 	return ret;
 }
 
+static void vfio_device_group_get_kvm(struct vfio_device *device)
+{
+	lockdep_assert_held(&device->dev_set->lock);
+
+	spin_lock(&device->group->kvm_ref_lock);
+
+	if (!device->group->kvm)
+		goto unlock;
+
+	device->kvm = device->group->kvm;
+	if (!vfio_device_get_kvm_safe(device))
+		device->kvm = NULL;
+
+unlock:
+	spin_unlock(&device->group->kvm_ref_lock);
+}
+
+static void vfio_device_group_put_kvm(struct vfio_device *device)
+{
+	lockdep_assert_held(&device->dev_set->lock);
+
+	vfio_device_put_kvm(device);
+	device->kvm = NULL;
+}
+
 static int vfio_device_group_open(struct vfio_device *device)
 {
 	int ret;
@@ -173,12 +198,12 @@ static int vfio_device_group_open(struct vfio_device *device)
 	 * the pointer in the device for use by drivers.
 	 */
 	if (device->open_count == 0)
-		vfio_device_get_kvm_safe(device);
+		vfio_device_group_get_kvm(device);
 
 	ret = vfio_device_open(device, device->group->iommufd, device->kvm);
 
 	if (device->open_count == 0)
-		vfio_device_put_kvm(device);
+		vfio_device_group_put_kvm(device);
 
 	mutex_unlock(&device->dev_set->lock);
 
@@ -195,7 +220,7 @@ void vfio_device_group_close(struct vfio_device *device)
 	vfio_device_close(device, device->group->iommufd);
 
 	if (device->open_count == 0)
-		vfio_device_put_kvm(device);
+		vfio_device_group_put_kvm(device);
 
 	mutex_unlock(&device->dev_set->lock);
 	mutex_unlock(&device->group->group_lock);
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 20d715b0a3a8..57b24931c742 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -253,11 +253,12 @@ enum { vfio_noiommu = false };
 #endif
 
 #ifdef CONFIG_HAVE_KVM
-void vfio_device_get_kvm_safe(struct vfio_device *device);
+bool vfio_device_get_kvm_safe(struct vfio_device *device);
 void vfio_device_put_kvm(struct vfio_device *device);
 #else
-static inline void vfio_device_get_kvm_safe(struct vfio_device *device)
+static inline bool vfio_device_get_kvm_safe(struct vfio_device *device)
 {
+	return false;
 }
 
 static inline void vfio_device_put_kvm(struct vfio_device *device)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 4762550e9f42..0b8fd296ae7e 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -342,7 +342,7 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
 
 #ifdef CONFIG_HAVE_KVM
-void vfio_device_get_kvm_safe(struct vfio_device *device)
+bool vfio_device_get_kvm_safe(struct vfio_device *device)
 {
 	void (*pfn)(struct kvm *kvm);
 	bool (*fn)(struct kvm *kvm);
@@ -350,32 +350,26 @@ void vfio_device_get_kvm_safe(struct vfio_device *device)
 
 	lockdep_assert_held(&device->dev_set->lock);
 
-	spin_lock(&device->group->kvm_ref_lock);
-	if (!device->group->kvm)
-		goto unlock;
-
 	pfn = symbol_get(kvm_put_kvm);
 	if (WARN_ON(!pfn))
-		goto unlock;
+		return false;
 
 	fn = symbol_get(kvm_get_kvm_safe);
 	if (WARN_ON(!fn)) {
 		symbol_put(kvm_put_kvm);
-		goto unlock;
+		return false;
 	}
 
-	ret = fn(device->group->kvm);
+	ret = fn(device->kvm);
 	symbol_put(kvm_get_kvm_safe);
 	if (!ret) {
 		symbol_put(kvm_put_kvm);
-		goto unlock;
+		return false;
 	}
 
 	device->put_kvm = pfn;
-	device->kvm = device->group->kvm;
 
-unlock:
-	spin_unlock(&device->group->kvm_ref_lock);
+	return true;
 }
 
 void vfio_device_put_kvm(struct vfio_device *device)
@@ -386,14 +380,11 @@ void vfio_device_put_kvm(struct vfio_device *device)
 		return;
 
 	if (WARN_ON(!device->put_kvm))
-		goto clear;
+		return;
 
 	device->put_kvm(device->kvm);
 	device->put_kvm = NULL;
 	symbol_put(kvm_put_kvm);
-
-clear:
-	device->kvm = NULL;
 }
 #endif
 



More information about the Intel-gfx mailing list