[EARLY RFC][PATCH] dma-buf: Add dma-buf heaps framework

Andrew F. Davis afd at ti.com
Mon Mar 4 14:53:42 UTC 2019


On 3/1/19 6:06 AM, Brian Starkey wrote:
> Hi Andrew,
> 
> Sorry for not managing to comment on this sooner, I've had a crazy few
> days.
> 
> As the others have said, I quite like the direction here.
> 
> On Mon, Feb 25, 2019 at 08:36:04AM -0600, Andrew F. Davis wrote:
>> This framework allows a unified userspace interface for dma-buf
>> exporters, allowing userland to allocate specific types of memory
>> for use in dma-buf sharing.
>>
>> Each heap is given its own device node, from which a user can allocate
>> a dma-buf fd using the DMA_HEAP_IOC_ALLOC ioctl.
>>
>> Signed-off-by: Andrew F. Davis <afd at ti.com>
>> ---
>>
>> Hello all,
>>
>> I had a little less time over the weekend than I thought I would to
>> clean this up more and finish the first set of user heaps, but wanted
>> to get this out anyway.
>>
>> ION in its current form assumes a lot about the memory it exports and
>> these assumptions lead to restrictions on the full range of operations
>> dma-bufs can support. Because of this, if we are to add this as an extension
>> of the core dma-buf, it should only be the user-space advertisement
>> and allocation front-end. All dma-buf exporting and operations need to be
>> placed in the control of the exporting heap. The core becomes rather
>> small, just a few hundred lines you see here. This is not to say we
>> should not provide helpers to make the heap exporters more simple, but
>> they should only be helpers and not enforced by the core framework.
>>
>> Basic use model here is an exporter (dedicated heap memory driver, CMA,
>> System, etc.) registers with the framework by providing a struct
>> dma_heap_info which gives us the needed info to export this heap to
>> userspace as a device node. Next a user will request an allocation,
>> the IOCTL is parsed and the request made to a heap provided alloc() op.
>> The heap should return the filled out struct dma_heap_buffer, including
>> exporting the buffer as a dma-buf. This dma-buf we then return to the
>> user as a fd. Buffer free is still a bit open as we need to update some
>> stats and free some memory, but the release operation is controlled by
>> the heap exporter, so some hook will have to be created.
>>
>> It all needs a bit more work, and I'm sure I'll find missing parts when
>> I add some more heaps, but hopefully this framework is simple enough that
>> it does not impede the implementation of all functionality once provided
>> by ION (shrinker, delayed free), nor any new functionality needed for
>> future heap exporting devices.
>>
>> Thanks,
>> Andrew
>>
>>  drivers/dma-buf/Kconfig       |  12 ++
>>  drivers/dma-buf/Makefile      |   1 +
>>  drivers/dma-buf/dma-heap.c    | 268 ++++++++++++++++++++++++++++++++++
>>  include/linux/dma-heap.h      |  57 ++++++++
>>  include/uapi/linux/dma-heap.h |  54 +++++++
>>  5 files changed, 392 insertions(+)
>>  create mode 100644 drivers/dma-buf/dma-heap.c
>>  create mode 100644 include/linux/dma-heap.h
>>  create mode 100644 include/uapi/linux/dma-heap.h
>>
>> diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig
>> index 2e5a0faa2cb1..30b0d7c83945 100644
>> --- a/drivers/dma-buf/Kconfig
>> +++ b/drivers/dma-buf/Kconfig
>> @@ -39,4 +39,16 @@ config UDMABUF
>>  	  A driver to let userspace turn memfd regions into dma-bufs.
>>  	  Qemu can use this to create host dmabufs for guest framebuffers.
>>  
>> +menuconfig DMABUF_HEAPS
>> +	bool "DMA-BUF Userland Memory Heaps"
>> +	depends on HAS_DMA && MMU
>> +	select GENERIC_ALLOCATOR
>> +	select DMA_SHARED_BUFFER
>> +	help
>> +	  Choose this option to enable the DMA-BUF userland memory heaps.
>> +	  This allows userspace to allocate dma-bufs that can be shared
>> +	  between drivers.
>> +
>> +source "drivers/dma-buf/heaps/Kconfig"
>> +
>>  endmenu
>> diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
>> index 0913a6ccab5a..b0332f143413 100644
>> --- a/drivers/dma-buf/Makefile
>> +++ b/drivers/dma-buf/Makefile
>> @@ -1,4 +1,5 @@
>>  obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o
>> +obj-$(CONFIG_DMABUF_HEAPS)	+= dma-heap.o
>>  obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
>>  obj-$(CONFIG_SW_SYNC)		+= sw_sync.o sync_debug.o
>>  obj-$(CONFIG_UDMABUF)		+= udmabuf.o
>> diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
>> new file mode 100644
>> index 000000000000..72ed225fa892
>> --- /dev/null
>> +++ b/drivers/dma-buf/dma-heap.c
>> @@ -0,0 +1,268 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Framework for userspace DMA-BUF allocations
>> + *
>> + * Copyright (C) 2011 Google, Inc.
>> + * Copyright (C) 2019 Linaro Ltd.
>> + */
>> +
>> +#include <linux/cdev.h>
>> +#include <linux/debugfs.h>
>> +#include <linux/device.h>
>> +#include <linux/dma-buf.h>
>> +#include <linux/err.h>
>> +#include <linux/idr.h>
>> +#include <linux/list.h>
>> +#include <linux/slab.h>
>> +#include <linux/uaccess.h>
>> +
>> +#include <linux/dma-heap.h>
>> +#include <uapi/linux/dma-heap.h>
>> +
>> +#define DEVNAME "dma_heap"
>> +
>> +#define NUM_HEAP_MINORS 128
>> +static DEFINE_IDR(dma_heap_idr);
>> +static DEFINE_MUTEX(minor_lock); /* Protect idr accesses */
>> +
>> +static dev_t dma_heap_devt;
>> +static struct class *dma_heap_class;
>> +static struct list_head dma_heap_list;
>> +static struct dentry *dma_heap_debug_root;
>> +
>> +/**
>> + * struct dma_heap - represents a dmabuf heap in the system
>> + * @name:		used for debugging/device-node name
>> + * @ops:		ops struct for this heap
>> + * @minor:		minor number of this heap device
>> + * @heap_devt:		heap device node
>> + * @heap_cdev:		heap char device
>> + * @num_of_buffers:	the number of currently allocated buffers
>> + * @num_of_alloc_bytes:	the number of allocated bytes
>> + * @alloc_bytes_wm:	the high watermark of allocated bytes
>> + * @stat_lock:		lock for heap statistics
>> + *
>> + * Represents a heap of memory from which buffers can be made.
>> + */
>> +struct dma_heap {
>> +	const char *name;
>> +	struct dma_heap_ops *ops;
>> +	unsigned int minor;
>> +	dev_t heap_devt;
>> +	struct cdev heap_cdev;
>> +
>> +	/* heap statistics */
>> +	u64 num_of_buffers;
>> +	u64 num_of_alloc_bytes;
>> +	u64 alloc_bytes_wm;
>> +	spinlock_t stat_lock;
>> +};
>> +
>> +/*
>> + * This should be called by the heap exporter when a buffer's dma-buf
>> + * handle is released so the core can update the stats and release the
>> + * buffer handle memory.
>> + *
>> + * Or we could find a way to hook into the fd or struct dma_buf itself?
>> + */
>> +void dma_heap_buffer_free(struct dma_heap_buffer *buffer)
>> +{
>> +	spin_lock(&buffer->heap->stat_lock);
>> +	buffer->heap->num_of_buffers--;
>> +	buffer->heap->num_of_alloc_bytes -= buffer->size;
>> +	spin_unlock(&buffer->heap->stat_lock);
>> +
>> +	kfree(buffer);
>> +}
>> +
>> +static int dma_heap_buffer_alloc(struct dma_heap *heap, size_t len, unsigned int flags)
>> +{
>> +	struct dma_heap_buffer *buffer;
>> +	int fd, ret;
>> +
>> +	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
>> +	if (!buffer)
>> +		return -ENOMEM;
>> +
>> +	buffer->heap = heap;
>> +	ret = heap->ops->allocate(heap, buffer, len, flags);
>> +	if (ret) {
>> +		kfree(buffer);
>> +		return ret;
>> +	}
>> +
>> +	fd = dma_buf_fd(buffer->dmabuf, O_CLOEXEC);
>> +	if (fd < 0) {
>> +		dma_buf_put(buffer->dmabuf);
>> +		return fd;
>> +	}
>> +
>> +	spin_lock(&heap->stat_lock);
>> +	heap->num_of_buffers++;
>> +	heap->num_of_alloc_bytes += buffer->size;
>> +	if (heap->num_of_alloc_bytes > heap->alloc_bytes_wm)
>> +		heap->alloc_bytes_wm = heap->num_of_alloc_bytes;
>> +	spin_unlock(&heap->stat_lock);
>> +
>> +	return fd;
>> +}
>> +
>> +static int dma_heap_open(struct inode *inode, struct file *filp)
>> +{
>> +	struct dma_heap *heap;
>> +
>> +	mutex_lock(&minor_lock);
>> +	heap = idr_find(&dma_heap_idr, iminor(inode));
>> +	mutex_unlock(&minor_lock);
>> +	if (!heap) {
>> +		pr_err("dma_heap: minor %d unknown.\n", iminor(inode));
>> +		return -ENODEV;
>> +	}
>> +
>> +	/* instance data as context */
>> +	filp->private_data = heap;
>> +	nonseekable_open(inode, filp);
>> +
>> +	return 0;
>> +}
>> +
>> +static int dma_heap_release(struct inode *inode, struct file *filp)
>> +{
>> +	filp->private_data = NULL;
>> +
>> +	return 0;
>> +}
>> +
>> +static long dma_heap_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>> +{
>> +	switch (cmd) {
>> +	case DMA_HEAP_IOC_ALLOC:
>> +	{
>> +		struct dma_heap_allocation_data heap_allocation;
>> +		struct dma_heap *heap = filp->private_data;
>> +		int fd;
>> +
>> +		if (copy_from_user(&heap_allocation, (void __user *)arg, _IOC_SIZE(cmd)))
>> +			return -EFAULT;
> 
> Am I right in thinking "cmd" comes from userspace? It might be a good
> idea to not trust _IOC_SIZE(cmd) and use our own. I'm looking at
> Documentation/ioctl/botching-up-ioctls.txt and drm_ioctl()
> 

Yes, cmd is from userspace, but we are in a switch-case that has already
matched cmd == DMA_HEAP_IOC_ALLOC, which is sized correctly.

>> +
>> +		if (heap_allocation.reserved0 ||
>> +		    heap_allocation.reserved1) {
>> +			pr_warn_once("dma_heap: ioctl data not valid\n");
>> +			return -EINVAL;
>> +		}
>> +
>> +		fd = dma_heap_buffer_alloc(heap, heap_allocation.len, heap_allocation.flags);
>> +		if (fd < 0)
>> +			return fd;
>> +
>> +		heap_allocation.fd = fd;
>> +
>> +		if (copy_to_user((void __user *)arg, &heap_allocation, _IOC_SIZE(cmd)))
>> +			return -EFAULT;
>> +
>> +		break;
>> +	}
>> +	default:
>> +		return -ENOTTY;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct file_operations dma_heap_fops = {
>> +	.owner          = THIS_MODULE,
>> +	.open		= dma_heap_open,
>> +	.release	= dma_heap_release,
>> +	.unlocked_ioctl = dma_heap_ioctl,
>> +#ifdef CONFIG_COMPAT
>> +	.compat_ioctl	= dma_heap_ioctl,
>> +#endif
>> +};
>> +
>> +int dma_heap_add(struct dma_heap_info *heap_info)
>> +{
>> +	struct dma_heap *heap;
>> +	struct device *dev_ret;
>> +	struct dentry *heap_root;
>> +	int ret;
>> +
>> +	if (!heap_info->name || !strcmp(heap_info->name, "")) {
>> +		pr_err("dma_heap: Cannot add heap without a name\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (!heap_info->ops || !heap_info->ops->allocate) {
>> +		pr_err("dma_heap: Cannot add heap with invalid ops struct\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	heap = kzalloc(sizeof(*heap), GFP_KERNEL);
>> +	if (!heap)
>> +		return -ENOMEM;
>> +
>> +	heap->name = heap_info->name;
>> +	heap->ops = heap_info->ops;
>> +
>> +	/* Find unused minor number */
>> +	mutex_lock(&minor_lock);
>> +	ret = idr_alloc(&dma_heap_idr, heap, 0, NUM_HEAP_MINORS, GFP_KERNEL);
>> +	mutex_unlock(&minor_lock);
>> +	if (ret < 0) {
>> +		pr_err("dma_heap: Unable to get minor number for heap\n");
>> +		return ret;
>> +	}
>> +	heap->minor = ret;
>> +
>> +	/* Create device */
>> +	heap->heap_devt = MKDEV(MAJOR(dma_heap_devt), heap->minor);
>> +	dev_ret = device_create(dma_heap_class,
>> +				NULL,
>> +				heap->heap_devt,
>> +				NULL,
>> +				heap->name);
>> +	if (IS_ERR(dev_ret)) {
>> +		pr_err("dma_heap: Unable to create char device\n");
>> +		return PTR_ERR(dev_ret);
>> +	}
>> +
>> +	/* Add device */
>> +	cdev_init(&heap->heap_cdev, &dma_heap_fops);
>> +	ret = cdev_add(&heap->heap_cdev, heap->heap_devt, 1);
>> +	if (ret < 0) {
>> +		device_destroy(dma_heap_class, heap->heap_devt);
>> +		pr_err("dma_heap: Unable to add char device\n");
>> +		return ret;
>> +	}
>> +
>> +	heap->num_of_buffers = 0;
>> +	heap->num_of_alloc_bytes = 0;
>> +	heap->alloc_bytes_wm = 0;
>> +	spin_lock_init(&heap->stat_lock);
>> +	heap_root = debugfs_create_dir(heap->name, dma_heap_debug_root);
>> +	debugfs_create_u64("num_of_buffers", 0444, heap_root, &heap->num_of_buffers);
>> +	debugfs_create_u64("num_of_alloc_bytes", 0444, heap_root, &heap->num_of_alloc_bytes);
>> +	debugfs_create_u64("alloc_bytes_wm", 0444, heap_root, &heap->alloc_bytes_wm);
> 
> I can see it being useful to be able to reset this one.
> 

Agree, looks like these will be pulled into the heaps themselves in the
next rev John is working on, so shouldn't matter here.

What we are moving to is a better (I think) ownership model. 'DMA-heaps'
only tracks 'heaps', 'heaps' track their 'buffers'. In the above we have
'DMA-heaps' tracking info on 'buffers', bypassing the 'heaps' layer, so
in the next rev the 'DMA-heaps' core will ask the 'heaps' to report
back their stats.

>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL(dma_heap_add);
>> +
>> +static int dma_heap_init(void)
>> +{
>> +	int ret;
>> +
>> +	ret = alloc_chrdev_region(&dma_heap_devt, 0, NUM_HEAP_MINORS, DEVNAME);
>> +	if (ret)
>> +		return ret;
>> +
>> +	dma_heap_class = class_create(THIS_MODULE, DEVNAME);
>> +	if (IS_ERR(dma_heap_class)) {
>> +		unregister_chrdev_region(dma_heap_devt, NUM_HEAP_MINORS);
>> +		return PTR_ERR(dma_heap_class);
>> +	}
>> +
>> +	dma_heap_debug_root = debugfs_create_dir("dma_heap", NULL);
>> +
>> +	return 0;
>> +}
>> +subsys_initcall(dma_heap_init);
>> diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
>> new file mode 100644
>> index 000000000000..09eab3105118
>> --- /dev/null
>> +++ b/include/linux/dma-heap.h
>> @@ -0,0 +1,57 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * DMABUF Heaps Allocation Infrastructure
>> + *
>> + * Copyright (C) 2011 Google, Inc.
>> + * Copyright (C) 2019 Linaro Ltd.
>> + */
>> +
>> +#ifndef _DMA_HEAPS_H
>> +#define _DMA_HEAPS_H
>> +
>> +#include <linux/types.h>
>> +
>> +/**
>> + * struct dma_heap_buffer - metadata for a particular buffer
>> + * @heap:		back pointer to the heap the buffer came from
>> + * @dmabuf:		backing dma-buf for this buffer
>> + * @size:		size of the buffer
>> + * @flags:		buffer specific flags
>> + */
>> +struct dma_heap_buffer {
>> +	struct dma_heap *heap;
>> +	struct dma_buf *dmabuf;
>> +	size_t size;
>> +	unsigned long flags;
>> +};
>> +
>> +/**
>> + * struct dma_heap_ops - ops to operate on a given heap
>> + * @allocate:		allocate buffer
>> + *
>> + * allocate returns 0 on success, -errno on error.
>> + */
>> +struct dma_heap_ops {
>> +	int (*allocate)(struct dma_heap *heap,
>> +			struct dma_heap_buffer *buffer,
>> +			unsigned long len,
>> +			unsigned long flags);
>> +};
>> +
>> +/**
>> + * struct dma_heap_info - holds information needed to create a DMA-heap
>> + * @ops:		ops struct for this heap
>> + * @name:		used for debugging/device-node name
>> + */
>> +struct dma_heap_info {
>> +	const char *name;
>> +	struct dma_heap_ops *ops;
>> +};
>> +
>> +/**
>> + * dma_heap_add - adds a heap to dmabuf heaps
>> + * @heap:		the heap to add
>> + */
>> +int dma_heap_add(struct dma_heap_info *heap_info);
>> +
> 
> For me, the main thing that's missing is a way to unregister a heap.
> If drivers/modules are registering heaps, then they also need to be
> able to remove them again when they go away (rmmod or whatever).
> 
> I think that starts to be quite a can of worms, as you might end up
> with buffers which outlive their allocator (that's true today as well
> afaik, but is in drivers rather than something more central).
> 

Yeah, was wanting to avoid dealing with all that, at least to start out
with. At a high level memory shouldn't ever just disappear, we should
block removing a heap driver until all buffers have been released.

>> +#endif /* _DMA_HEAPS_H */
>> diff --git a/include/uapi/linux/dma-heap.h b/include/uapi/linux/dma-heap.h
>> new file mode 100644
>> index 000000000000..7dcbb98e29ec
>> --- /dev/null
>> +++ b/include/uapi/linux/dma-heap.h
>> @@ -0,0 +1,54 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * DMABUF Heaps Userspace API
>> + *
>> + * Copyright (C) 2011 Google, Inc.
>> + * Copyright (C) 2019 Linaro Ltd.
>> + */
>> +#ifndef _UAPI_LINUX_DMA_HEAP_H
>> +#define _UAPI_LINUX_DMA_HEAP_H
>> +
>> +#include <linux/ioctl.h>
>> +#include <linux/types.h>
>> +
>> +/**
>> + * DOC: DMABUF Heaps Userspace API
>> + *
>> + */
>> +
>> +/*
>> + * mappings of this buffer should be un-cached, otherwise dmabuf heaps will
>> + * need cache maintenance using DMA_BUF_SYNC_* when the buffer is mapped for dma
>> + */
>> +#define DMA_HEAP_FLAG_COHERENT 1
> 
> I'm not really clear on the intention of this flag, and I do think
> that "COHERENT" is a confusing/overloaded term.
> 

It is, I wanted to use the term as used by the other DMA frameworks, but
I don't really like it either.

> To me, if the buffer is actually coherent with DMA masters, then
> there's no need for any cache maintenance - which is the opposite of
> what the comment says.
> 

Buffers themselves can't be (non)coherent; they are just memory. Masters
on a bus can have coherent interactions depending on which masters are
talking and how the buffer in question is managed. So really the term
isn't right almost anywhere, as it only applies from the perspective of
the local core (running Linux), and only if simply not locally caching
the buffer is enough to make it "coherent". But that is a rant best
saved for another time.

For us, let's drop that flag. If you want to allocate from a
non-locally-cached buffer, then that can be its own heap that only
provides that type of allocation. Or the same heap exporter could
provide both types by registering itself twice with the DMA-heaps core:
once for an interface that gives out cached buffers and once for
uncached ones.

Andrew

> Cheers,
> -Brian
> 
>> +
>> +/**
>> + * struct dma_heap_allocation_data - metadata passed from userspace for
>> + *                                      allocations
>> + * @len:		size of the allocation
>> + * @flags:		flags passed to the heap
>> + * @fd:			will be populated with a fd which provides the
>> + *			handle to the allocated dma-buf
>> + *
>> + * Provided by userspace as an argument to the ioctl
>> + */
>> +struct dma_heap_allocation_data {
>> +	__u64 len;
>> +	__u32 flags;
>> +	__u32 fd;
>> +	__u32 reserved0;
>> +	__u32 reserved1;
>> +};
>> +
>> +#define DMA_HEAP_IOC_MAGIC		'H'
>> +
>> +/**
>> + * DOC: DMA_HEAP_IOC_ALLOC - allocate memory from a heap
>> + *
>> + * Takes a dma_heap_allocation_data struct and returns it with the fd field
>> + * populated with the dmabuf handle of the allocation.
>> + */
>> +#define DMA_HEAP_IOC_ALLOC	_IOWR(DMA_HEAP_IOC_MAGIC, 0, \
>> +				      struct dma_heap_allocation_data)
>> +
>> +#endif /* _UAPI_LINUX_DMA_HEAP_H */
>> -- 
>> 2.19.1
>>

