[PATCH v2 1/3] dma-buf: Add ioctl to query mmap coherency/cache info

Daniel Vetter daniel at ffwll.ch
Wed Sep 7 16:43:10 UTC 2022


On Tue, Aug 16, 2022 at 06:50:54PM +0200, Christian König wrote:
> Am 16.08.22 um 16:26 schrieb Rob Clark:
> > On Tue, Aug 16, 2022 at 1:27 AM Christian König
> > <christian.koenig at amd.com> wrote:
> > > Am 15.08.22 um 23:15 schrieb Rob Clark:
> > > > From: Rob Clark <robdclark at chromium.org>
> > > > 
> > > > This is a fairly narrowly focused interface, providing a way for a VMM
> > > > in userspace to tell the guest kernel what pgprot settings to use when
> > > > mapping a buffer to guest userspace.
> > > > 
> > > > For buffers that get mapped into guest userspace, virglrenderer returns
> > > > a dma-buf fd to the VMM (crosvm or qemu).  In addition to mapping the
> > > > pages into the guest VM, it needs to report to drm/virtio in the guest
> > > > the cache settings to use for guest userspace.  In particular, on some
> > > > architectures, creating aliased mappings with different cache attributes
> > > > is frowned upon, so it is important that the guest mappings have the
> > > > same cache attributes as any potential host mappings.
> > > > 
> > > > Signed-off-by: Rob Clark <robdclark at chromium.org>
> > > > ---
> > > > v2: Combine with coherency, as that is a related concept.. and it is
> > > >       relevant to the VMM whether coherent access without the SYNC ioctl
> > > >       is possible; set map_info at export time to make it more clear
> > > >       that it applies for the lifetime of the dma-buf (for any mmap
> > > >       created via the dma-buf)
> > > Well, exactly that's a conceptual NAK from my side.
> > > 
> > > The caching information can change at any time. For CPU mappings even
> > > without further notice from the exporter.
> > You should look before you criticize, as I left you a way out.. the
> > idea was that DMA_BUF_MAP_INCOHERENT should indicate that the buffer
> > cannot be mapped to the guest.  We could ofc add more DMA_BUF_MAP_*
> > values if something else would suit you better.  But the goal is to
> > give the VMM enough information to dtrt, or return an error if mapping
> > to guest is not possible.  That seems better than assuming mapping to
> > guest will work and guessing about cache attrs
> 
> Well I'm not rejecting the implementation, I'm rejecting this from the
> conceptual point of view.
> 
> We intentional gave the exporter full control over the CPU mappings. This
> approach here breaks that now.
> 
> I haven't seen the full detailed reason why we should do that and to be
> honest KVM seems to mess with things it is not supposed to touch.
> 
> For example the page reference count of mappings marked with VM_IO is a
> complete no-go. This is a very strong evidence that this was based on rather
> dangerous halve knowledge about the background of the handling here.

Wut?

KVM grabs page references of VM_IO vma? I thought the issue was that we
still had some bo/dma-buf vma that didn't set either VM_IO or VM_PFNMAP,
and not that kvm was just outright breaking every core mm contract there
is.

Is this really what's going on in that other thread about "fixing" ttm?
-Daniel

> So as long as I don't see a full explanation why KVM is grabbing reference
> to pages while faulting them and why we manually need to forward the caching
> while the hardware documentation indicates otherwise I will be rejecting
> this whole approach.
> 
> Regards,
> Christian.
> 
> > 
> > BR,
> > -R
> > 
> > > If the hardware can't use the caching information from the host CPU page
> > > tables directly then that pretty much completely breaks the concept that
> > > the exporter is responsible for setting up those page tables.
> > > 
> > > Regards,
> > > Christian.
> > > 
> > > >    drivers/dma-buf/dma-buf.c    | 63 +++++++++++++++++++++++++++------
> > > >    include/linux/dma-buf.h      | 11 ++++++
> > > >    include/uapi/linux/dma-buf.h | 68 ++++++++++++++++++++++++++++++++++++
> > > >    3 files changed, 132 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > > > index 32f55640890c..262c4706f721 100644
> > > > --- a/drivers/dma-buf/dma-buf.c
> > > > +++ b/drivers/dma-buf/dma-buf.c
> > > > @@ -125,6 +125,32 @@ static struct file_system_type dma_buf_fs_type = {
> > > >        .kill_sb = kill_anon_super,
> > > >    };
> > > > 
> > > > +static int __dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> > > > +{
> > > > +     int ret;
> > > > +
> > > > +     /* check if buffer supports mmap */
> > > > +     if (!dmabuf->ops->mmap)
> > > > +             return -EINVAL;
> > > > +
> > > > +     ret = dmabuf->ops->mmap(dmabuf, vma);
> > > > +
> > > > +     /*
> > > > +      * If the exporter claims to support coherent access, ensure the
> > > > +      * pgprot flags match the claim.
> > > > +      */
> > > > +     if ((dmabuf->map_info != DMA_BUF_MAP_INCOHERENT) && !ret) {
> > > > +             pgprot_t wc_prot = pgprot_writecombine(vma->vm_page_prot);
> > > > +             if (dmabuf->map_info == DMA_BUF_COHERENT_WC) {
> > > > +                     WARN_ON_ONCE(pgprot_val(vma->vm_page_prot) != pgprot_val(wc_prot));
> > > > +             } else {
> > > > +                     WARN_ON_ONCE(pgprot_val(vma->vm_page_prot) == pgprot_val(wc_prot));
> > > > +             }
> > > > +     }
> > > > +
> > > > +     return ret;
> > > > +}
> > > > +
> > > >    static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
> > > >    {
> > > >        struct dma_buf *dmabuf;
> > > > @@ -134,16 +160,12 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
> > > > 
> > > >        dmabuf = file->private_data;
> > > > 
> > > > -     /* check if buffer supports mmap */
> > > > -     if (!dmabuf->ops->mmap)
> > > > -             return -EINVAL;
> > > > -
> > > >        /* check for overflowing the buffer's size */
> > > >        if (vma->vm_pgoff + vma_pages(vma) >
> > > >            dmabuf->size >> PAGE_SHIFT)
> > > >                return -EINVAL;
> > > > 
> > > > -     return dmabuf->ops->mmap(dmabuf, vma);
> > > > +     return __dma_buf_mmap(dmabuf, vma);
> > > >    }
> > > > 
> > > >    static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
> > > > @@ -326,6 +348,27 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> > > >        return 0;
> > > >    }
> > > > 
> > > > +static long dma_buf_info(struct dma_buf *dmabuf, void __user *uarg)
> > > > +{
> > > > +     struct dma_buf_info arg;
> > > > +
> > > > +     if (copy_from_user(&arg, uarg, sizeof(arg)))
> > > > +             return -EFAULT;
> > > > +
> > > > +     switch (arg.param) {
> > > > +     case DMA_BUF_INFO_MAP_INFO:
> > > > +             arg.value = dmabuf->map_info;
> > > > +             break;
> > > > +     default:
> > > > +             return -EINVAL;
> > > > +     }
> > > > +
> > > > +     if (copy_to_user(uarg, &arg, sizeof(arg)))
> > > > +             return -EFAULT;
> > > > +
> > > > +     return 0;
> > > > +}
> > > > +
> > > >    static long dma_buf_ioctl(struct file *file,
> > > >                          unsigned int cmd, unsigned long arg)
> > > >    {
> > > > @@ -369,6 +412,9 @@ static long dma_buf_ioctl(struct file *file,
> > > >        case DMA_BUF_SET_NAME_B:
> > > >                return dma_buf_set_name(dmabuf, (const char __user *)arg);
> > > > 
> > > > +     case DMA_BUF_IOCTL_INFO:
> > > > +             return dma_buf_info(dmabuf, (void __user *)arg);
> > > > +
> > > >        default:
> > > >                return -ENOTTY;
> > > >        }
> > > > @@ -530,6 +576,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
> > > >        dmabuf->priv = exp_info->priv;
> > > >        dmabuf->ops = exp_info->ops;
> > > >        dmabuf->size = exp_info->size;
> > > > +     dmabuf->map_info = exp_info->map_info;
> > > >        dmabuf->exp_name = exp_info->exp_name;
> > > >        dmabuf->owner = exp_info->owner;
> > > >        spin_lock_init(&dmabuf->name_lock);
> > > > @@ -1245,10 +1292,6 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
> > > >        if (WARN_ON(!dmabuf || !vma))
> > > >                return -EINVAL;
> > > > 
> > > > -     /* check if buffer supports mmap */
> > > > -     if (!dmabuf->ops->mmap)
> > > > -             return -EINVAL;
> > > > -
> > > >        /* check for offset overflow */
> > > >        if (pgoff + vma_pages(vma) < pgoff)
> > > >                return -EOVERFLOW;
> > > > @@ -1262,7 +1305,7 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
> > > >        vma_set_file(vma, dmabuf->file);
> > > >        vma->vm_pgoff = pgoff;
> > > > 
> > > > -     return dmabuf->ops->mmap(dmabuf, vma);
> > > > +     return __dma_buf_mmap(dmabuf, vma);
> > > >    }
> > > >    EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, DMA_BUF);
> > > > 
> > > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > > > index 71731796c8c3..37923c8d5c24 100644
> > > > --- a/include/linux/dma-buf.h
> > > > +++ b/include/linux/dma-buf.h
> > > > @@ -23,6 +23,8 @@
> > > >    #include <linux/dma-fence.h>
> > > >    #include <linux/wait.h>
> > > > 
> > > > +#include <uapi/linux/dma-buf.h>
> > > > +
> > > >    struct device;
> > > >    struct dma_buf;
> > > >    struct dma_buf_attachment;
> > > > @@ -307,6 +309,13 @@ struct dma_buf {
> > > >         */
> > > >        size_t size;
> > > > 
> > > > +     /**
> > > > +      * @map_info:
> > > > +      *
> > > > +      * CPU mapping/coherency information for the buffer.
> > > > +      */
> > > > +     enum dma_buf_map_info map_info;
> > > > +
> > > >        /**
> > > >         * @file:
> > > >         *
> > > > @@ -533,6 +542,7 @@ struct dma_buf_attachment {
> > > >     * @ops:    Attach allocator-defined dma buf ops to the new buffer
> > > >     * @size:   Size of the buffer - invariant over the lifetime of the buffer
> > > >     * @flags:  mode flags for the file
> > > > + * @map_info:        CPU mapping/coherency information for the buffer
> > > >     * @resv:   reservation-object, NULL to allocate default one
> > > >     * @priv:   Attach private data of allocator to this buffer
> > > >     *
> > > > @@ -545,6 +555,7 @@ struct dma_buf_export_info {
> > > >        const struct dma_buf_ops *ops;
> > > >        size_t size;
> > > >        int flags;
> > > > +     enum dma_buf_map_info map_info;
> > > >        struct dma_resv *resv;
> > > >        void *priv;
> > > >    };
> > > > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> > > > index b1523cb8ab30..07b403ffdb43 100644
> > > > --- a/include/uapi/linux/dma-buf.h
> > > > +++ b/include/uapi/linux/dma-buf.h
> > > > @@ -85,6 +85,72 @@ struct dma_buf_sync {
> > > > 
> > > >    #define DMA_BUF_NAME_LEN    32
> > > > 
> > > > +/**
> > > > + * enum dma_buf_map_info - CPU mapping info
> > > > + *
> > > > + * This enum describes coherency of a userspace mapping of the dmabuf.
> > > > + *
> > > > + * Importing devices should check dma_buf::map_info flag and reject an
> > > > + * import if unsupported.  For example, if the exporting device uses
> > > > + * @DMA_BUF_COHERENT_CACHED but the importing device does not support
> > > > + * CPU cache coherency, the dma-buf import should fail.
> > > > + */
> > > > +enum dma_buf_map_info {
> > > > +     /**
> > > > +      * @DMA_BUF_MAP_INCOHERENT: CPU mapping is incoherent.
> > > > +      *
> > > > +      * Use of DMA_BUF_IOCTL_SYNC is required for CPU managed coherenency.
> > > > +      */
> > > > +     DMA_BUF_MAP_INCOHERENT,
> > > > +
> > > > +     /**
> > > > +      * @DMA_BUF_COHERENT_WC: CPU mapping is coherent but not cached.
> > > > +      *
> > > > +      * A cpu mmap'ing is coherent, and DMA_BUF_IOCTL_SYNC is not required.
> > > > +      * However fences may be still required for synchronizing access.  Ie.
> > > > +      * coherency can only be relied upon by an explicit-fencing userspace.
> > > > +      * An implicit-sync userspace must still use DMA_BUF_IOCTL_SYNC.
> > > > +      *
> > > > +      * The cpu mapping is writecombine.
> > > > +      */
> > > > +     DMA_BUF_COHERENT_WC,
> > > > +
> > > > +     /**
> > > > +      * @DMA_BUF_COHERENT_CACHED: CPU mapping is coherent and CPU cached.
> > > > +      *
> > > > +      * A cpu mmap'ing is coherent, and DMA_BUF_IOCTL_SYNC is not required.
> > > > +      * However fences may be still required for synchronizing access.  Ie.
> > > > +      * coherency can only be relied upon by an explicit-fencing userspace.
> > > > +      * An implicit-sync userspace must still use DMA_BUF_IOCTL_SYNC.
> > > > +      *
> > > > +      * The cpu mapping is cached.
> > > > +      */
> > > > +     DMA_BUF_COHERENT_CACHED,
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct dma_buf_info - Query info about the buffer.
> > > > + */
> > > > +struct dma_buf_info {
> > > > +
> > > > +#define DMA_BUF_INFO_MAP_INFO    1
> > > > +
> > > > +     /**
> > > > +      * @param: Which param to query
> > > > +      *
> > > > +      * DMA_BUF_INFO_MAP_INFO:
> > > > +      *     Returns enum dma_buf_map_info, describing the coherency and
> > > > +      *     caching of a CPU mapping of the buffer.
> > > > +      */
> > > > +     __u32 param;
> > > > +     __u32 pad;
> > > > +
> > > > +     /**
> > > > +      * @value: Return value of the query.
> > > > +      */
> > > > +     __u64 value;
> > > > +};
> > > > +
> > > >    #define DMA_BUF_BASE                'b'
> > > >    #define DMA_BUF_IOCTL_SYNC  _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
> > > > 
> > > > @@ -95,4 +161,6 @@ struct dma_buf_sync {
> > > >    #define DMA_BUF_SET_NAME_A  _IOW(DMA_BUF_BASE, 1, __u32)
> > > >    #define DMA_BUF_SET_NAME_B  _IOW(DMA_BUF_BASE, 1, __u64)
> > > > 
> > > > +#define DMA_BUF_IOCTL_INFO   _IOWR(DMA_BUF_BASE, 2, struct dma_buf_info)
> > > > +
> > > >    #endif
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list