[PATCH V5 4/6] mdev: introduce virtio device and its device ops

Jason Wang jasowang at redhat.com
Fri Oct 25 01:54:20 UTC 2019


On 2019/10/25 4:44 AM, Alex Williamson wrote:
> On Thu, 24 Oct 2019 11:51:35 +0800
> Jason Wang<jasowang at redhat.com>  wrote:
>
>> On 2019/10/24 5:57 AM, Alex Williamson wrote:
>>> On Wed, 23 Oct 2019 21:07:50 +0800
>>> Jason Wang<jasowang at redhat.com>  wrote:
>>>   
>>>> This patch implements basic support for mdev driver that supports
>>>> virtio transport for kernel virtio driver.
>>>>
>>>> Signed-off-by: Jason Wang<jasowang at redhat.com>
>>>> ---
>>>>    drivers/vfio/mdev/mdev_core.c    |  20 ++++
>>>>    drivers/vfio/mdev/mdev_private.h |   2 +
>>>>    include/linux/mdev.h             |   6 ++
>>>>    include/linux/virtio_mdev_ops.h  | 159 +++++++++++++++++++++++++++++++
>>>>    4 files changed, 187 insertions(+)
>>>>    create mode 100644 include/linux/virtio_mdev_ops.h
>>>>
>>>> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
>>>> index 555bd61d8c38..9b00c3513120 100644
>>>> --- a/drivers/vfio/mdev/mdev_core.c
>>>> +++ b/drivers/vfio/mdev/mdev_core.c
>>>> @@ -76,6 +76,26 @@ const struct vfio_mdev_device_ops *mdev_get_vfio_ops(struct mdev_device *mdev)
>>>>    }
>>>>    EXPORT_SYMBOL(mdev_get_vfio_ops);
>>>>    
>>>> +/* Specify the virtio device ops for the mdev device, this
>>>> + * must be called during create() callback for virtio mdev device.
>>>> + */
>>>> +void mdev_set_virtio_ops(struct mdev_device *mdev,
>>>> +			 const struct virtio_mdev_device_ops *virtio_ops)
>>>> +{
>>>> +	mdev_set_class(mdev, MDEV_CLASS_ID_VIRTIO);
>>>> +	mdev->virtio_ops = virtio_ops;
>>>> +}
>>>> +EXPORT_SYMBOL(mdev_set_virtio_ops);
>>>> +
>>>> +/* Get the virtio device ops for the mdev device. */
>>>> +const struct virtio_mdev_device_ops *
>>>> +mdev_get_virtio_ops(struct mdev_device *mdev)
>>>> +{
>>>> +	WARN_ON(mdev->class_id != MDEV_CLASS_ID_VIRTIO);
>>>> +	return mdev->virtio_ops;
>>>> +}
>>>> +EXPORT_SYMBOL(mdev_get_virtio_ops);
>>>> +
>>>>    struct device *mdev_dev(struct mdev_device *mdev)
>>>>    {
>>>>    	return &mdev->dev;
>>>> diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
>>>> index 0770410ded2a..7b47890c34e7 100644
>>>> --- a/drivers/vfio/mdev/mdev_private.h
>>>> +++ b/drivers/vfio/mdev/mdev_private.h
>>>> @@ -11,6 +11,7 @@
>>>>    #define MDEV_PRIVATE_H
>>>>    
>>>>    #include <linux/vfio_mdev_ops.h>
>>>> +#include <linux/virtio_mdev_ops.h>
>>>>    
>>>>    int  mdev_bus_register(void);
>>>>    void mdev_bus_unregister(void);
>>>> @@ -38,6 +39,7 @@ struct mdev_device {
>>>>    	u16 class_id;
>>>>    	union {
>>>>    		const struct vfio_mdev_device_ops *vfio_ops;
>>>> +		const struct virtio_mdev_device_ops *virtio_ops;
>>>>    	};
>>>>    };
>>>>    
>>>> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
>>>> index 4625f1a11014..9b69b0bbebfd 100644
>>>> --- a/include/linux/mdev.h
>>>> +++ b/include/linux/mdev.h
>>>> @@ -17,6 +17,7 @@
>>>>    
>>>>    struct mdev_device;
>>>>    struct vfio_mdev_device_ops;
>>>> +struct virtio_mdev_device_ops;
>>>>    
>>>>    /*
>>>>     * Called by the parent device driver to set the device which represents
>>>> @@ -112,6 +113,10 @@ void mdev_set_class(struct mdev_device *mdev, u16 id);
>>>>    void mdev_set_vfio_ops(struct mdev_device *mdev,
>>>>    		       const struct vfio_mdev_device_ops *vfio_ops);
>>>>    const struct vfio_mdev_device_ops *mdev_get_vfio_ops(struct mdev_device *mdev);
>>>> +void mdev_set_virtio_ops(struct mdev_device *mdev,
>>>> +			 const struct virtio_mdev_device_ops *virtio_ops);
>>>> +const struct virtio_mdev_device_ops *
>>>> +mdev_get_virtio_ops(struct mdev_device *mdev);
>>>>    
>>>>    extern struct bus_type mdev_bus_type;
>>>>    
>>>> @@ -127,6 +132,7 @@ struct mdev_device *mdev_from_dev(struct device *dev);
>>>>    
>>>>    enum {
>>>>    	MDEV_CLASS_ID_VFIO = 1,
>>>> +	MDEV_CLASS_ID_VIRTIO = 2,
>>>>    	/* New entries must be added here */
>>>>    };
>>>>    
>>>> diff --git a/include/linux/virtio_mdev_ops.h b/include/linux/virtio_mdev_ops.h
>>>> new file mode 100644
>>>> index 000000000000..d417b41f2845
>>>> --- /dev/null
>>>> +++ b/include/linux/virtio_mdev_ops.h
>>>> @@ -0,0 +1,159 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>> +/*
>>>> + * Virtio mediated device driver
>>>> + *
>>>> + * Copyright 2019, Red Hat Corp.
>>>> + *     Author: Jason Wang<jasowang at redhat.com>
>>>> + */
>>>> +#ifndef _LINUX_VIRTIO_MDEV_H
>>>> +#define _LINUX_VIRTIO_MDEV_H
>>>> +
>>>> +#include <linux/interrupt.h>
>>>> +#include <linux/mdev.h>
>>>> +#include <uapi/linux/vhost.h>
>>>> +
>>>> +#define VIRTIO_MDEV_DEVICE_API_STRING		"virtio-mdev"
>>>> +#define VIRTIO_MDEV_F_VERSION_1 0x1
>>>> +
>>>> +struct virtio_mdev_callback {
>>>> +	irqreturn_t (*callback)(void *data);
>>>> +	void *private;
>>>> +};
>>>> +
>>>> +/**
>>>> + * struct virtio_mdev_device_ops - Structure to be registered for each
>>>> + * mdev device to register the device for virtio/vhost drivers.
>>>> + *
>>>> + * The device ops that is supported by VIRTIO_MDEV_F_VERSION_1, the
>>>> + * callbacks are mandatory unless explicitly mentioned.
>>> If the version of the callbacks is returned by a callback within the
>>> structure defined by the version... isn't that a bit circular?  This
>>> seems redundant to me versus the class id.  The fact that the parent
>>> driver defines the device as MDEV_CLASS_ID_VIRTIO should tell us this
>>> already.  If it was incremented, we'd need an MDEV_CLASS_ID_VIRTIOv2,
>>> which the virtio-mdev bus driver could add to its id table and handle
>>> differently.
>> My understanding is that versions are only allowed to increase
>> monotonically, which seems less flexible than features. E.g. if we have
>> features A, B, C, an mdev device can choose to support only a subset.
>> E.g. when an mdev device can support dirty page logging, it can add a
>> new feature bit so the driver knows that it supports new device ops.
>> MDEV_CLASS_ID_VIRTIOv2 may only be useful when we invent a completely
>> new API.
> But this interface rather conflates features and versions by returning
> a version as a feature.  If we simply want to say that there are no
> additional features, then get_mdev_features() should return an empty
> set.  If dirty page logging is a feature, then I'd expect a bit in the
> get_mdev_features() return value to identify that feature.
>
> However, I've been under the impression (perhaps wrongly) that the
> class-id has a 1:1 correlation to the device-ops exposed to the bus
> driver, so if dirty page logging requires extra callbacks, that would
> imply a new device-ops, which requires a new class-id.  In that case
> virtio-mdev would claim both class-ids and would need some way to
> differentiate them.  But I also see that such a solution can become
> unmanageable as the set of class-ids would need to encompass every
> combination of features.
>
> So I think what's suggested by this is that the initial struct
> virtio_mdev_device_ops is a base set of callbacks which would be
> extended via features?


Yes.


> But then why does get_generation() not make use
> of this?  And if we can define get_generation() as optional and simply
> test if it's implemented, then it seems we don't need any feature bits
> to extend the structure (unless we're looking at binary compatibility).


I think the mapping between ops and features is N:1. We start with a
base feature like VIRTIO_MDEV_F_VERSION_1, which allows us to do a
simple check during probe without needing to check for the existence of
each op. Testing for the existence of a specific op should work, but it
is less convenient when a feature requires several new ops. In the case
of get_generation(), it works because the bus driver has a workaround
when the parent doesn't provide it. This may not work for a real
feature.
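
To make the N:1 idea concrete, here is a rough userspace sketch of what
a probe-time check could look like. Only VIRTIO_MDEV_F_VERSION_1 is
taken from the patch. The callback signatures, VIRTIO_MDEV_F_DIRTY_LOG,
log_start() and log_sync() are assumptions made up for illustration,
and returning 0 when get_generation() is missing is only one possible
workaround.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; only VIRTIO_MDEV_F_VERSION_1 comes from the patch. */
#define VIRTIO_MDEV_F_VERSION_1  0x1ULL
#define VIRTIO_MDEV_F_DIRTY_LOG  0x2ULL  /* hypothetical extension feature */

struct mdev_device;                      /* opaque in this sketch */

struct virtio_mdev_device_ops {
	/* Signatures below are assumed, not copied from the patch. */
	uint64_t (*get_mdev_features)(struct mdev_device *mdev);
	uint32_t (*get_generation)(struct mdev_device *mdev); /* optional */
	/* Hypothetical ops that would both be implied by F_DIRTY_LOG. */
	int (*log_start)(struct mdev_device *mdev);
	int (*log_sync)(struct mdev_device *mdev);
};

/* Probe-time check: test feature bits instead of probing every op. */
static int virtio_mdev_check_ops(const struct virtio_mdev_device_ops *ops,
				 struct mdev_device *mdev)
{
	uint64_t features;

	if (!ops || !ops->get_mdev_features)
		return -1;

	features = ops->get_mdev_features(mdev);
	if (!(features & VIRTIO_MDEV_F_VERSION_1))
		return -1;	/* base set of ops is not promised */

	/* One bit covers several ops (the N:1 mapping). */
	if ((features & VIRTIO_MDEV_F_DIRTY_LOG) &&
	    (!ops->log_start || !ops->log_sync))
		return -1;	/* parent advertised a feature it lacks */

	return 0;
}

/* Optional op: fall back to 0 when the parent does not implement it. */
static uint32_t virtio_mdev_generation(const struct virtio_mdev_device_ops *ops,
				       struct mdev_device *mdev)
{
	return ops->get_generation ? ops->get_generation(mdev) : 0;
}

/* Example parent that only offers the base feature. */
static uint64_t demo_get_mdev_features(struct mdev_device *mdev)
{
	(void)mdev;
	return VIRTIO_MDEV_F_VERSION_1;
}

static const struct virtio_mdev_device_ops demo_ops = {
	.get_mdev_features = demo_get_mdev_features,
};
```

With this shape, a parent that sets only the base bit passes the check
without the bus driver looking at each callback, and an optional op
like get_generation() still works through the fallback.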


> So maybe get_mdev_features() is meant to expose underlying features
> unrelated to the callbacks?  Which is not in line with the description?


At least not for the current virtio-mdev/vhost-mdev. Virtio-specific
features should be exposed via get_features().
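
As a small illustration of keeping the two feature spaces apart, the
sketch below uses one bitmap for the mdev-level API features and a
separate one for the virtio device features. The callback signatures
and the demo parent are assumptions for this example.
VIRTIO_F_VERSION_1 is bit 32 as defined by the virtio specification.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_MDEV_F_VERSION_1 0x1ULL	/* from the patch: mdev API level */
#define VIRTIO_F_VERSION_1	32	/* virtio spec feature bit number */

struct mdev_device;			/* opaque in this sketch */

struct virtio_mdev_device_ops {
	/* Assumed signatures: both return a 64-bit feature bitmap. */
	uint64_t (*get_mdev_features)(struct mdev_device *mdev);
	uint64_t (*get_features)(struct mdev_device *mdev);
};

/* mdev-level features: describe which device ops the parent provides. */
static uint64_t demo_get_mdev_features(struct mdev_device *mdev)
{
	(void)mdev;
	return VIRTIO_MDEV_F_VERSION_1;
}

/* virtio-level features: what the virtio device offers to its driver. */
static uint64_t demo_get_features(struct mdev_device *mdev)
{
	(void)mdev;
	return 1ULL << VIRTIO_F_VERSION_1;
}

static const struct virtio_mdev_device_ops demo_ops = {
	.get_mdev_features = demo_get_mdev_features,
	.get_features = demo_get_features,
};

/* Returns 1 when the device offers the given virtio feature bit. */
static int virtio_mdev_has_virtio_feature(const struct virtio_mdev_device_ops *ops,
					  struct mdev_device *mdev,
					  unsigned int bit)
{
	return (int)((ops->get_features(mdev) >> bit) & 1);
}
```

The point is only that the two bitmaps are never mixed: a bit in
get_mdev_features() says something about the ops, while a bit in
get_features() is negotiated like any other virtio feature.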


> Hopefully you can see my confusion in what we're trying to do here.


I think I understand your point, and I hope my answer makes sense.
Suggestions are welcome.

Thanks


> Thanks,
>
> Alex
>


