[PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API
Daniel Vetter
daniel at ffwll.ch
Wed Mar 30 09:19:23 UTC 2022
On Tue, Mar 29, 2022 at 06:16:56PM +0100, Paul Cercueil wrote:
> Hi Daniel,
>
> Le mar., mars 29 2022 at 16:10:44 +0200, Daniel Vetter <daniel at ffwll.ch> a
> écrit :
> > On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote:
> > > Hi Daniel,
> > >
> > > Le mar., mars 29 2022 at 10:33:32 +0200, Daniel Vetter
> > > <daniel at ffwll.ch> a
> > > écrit :
> > > > On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
> > > > > Hi Jonathan,
> > > > >
> > > > > Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron
> > > > > <jic23 at kernel.org> a écrit :
> > > > > > On Mon, 7 Feb 2022 12:59:21 +0000
> > > > > > Paul Cercueil <paul at crapouillou.net> wrote:
> > > > > >
> > > > > > > Hi Jonathan,
> > > > > > >
> > > > > > > This is the V2 of my patchset that introduces a new
> > > userspace
> > > > > > > interface
> > > > > > > based on DMABUF objects to complement the fileio API, and
> > > adds
> > > > > > > write()
> > > > > > > support to the existing fileio API.
> > > > > >
> > > > > > Hi Paul,
> > > > > >
> > > > > > It's been a little while. Perhaps you could summarize the
> > > various
> > > > > view
> > > > > > points around the appropriateness of using DMABUF for this?
> > > > > > I appreciate it is a tricky topic to distil into a brief
> > > summary
> > > > > but
> > > > > > I know I would find it useful even if no one else does!
> > > > >
> > > > > So we want to have a high-speed interface where buffers of
> > > samples
> > > > > are
> > > > > passed around between IIO devices and other devices (e.g. USB
> > > or
> > > > > network),
> > > > > or made available to userspace without copying the data.
> > > > >
> > > > > DMABUF is, at least in theory, exactly what we need. Quoting
> > > the
> > > > > documentation
> > > > >
> > > (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html):
> > > > > "The dma-buf subsystem provides the framework for sharing
> > > buffers
> > > > > for
> > > > > hardware (DMA) access across multiple device drivers and
> > > > > subsystems, and for
> > > > > synchronizing asynchronous hardware access. This is used, for
> > > > > example, by
> > > > > drm “prime” multi-GPU support, but is of course not limited to
> > > GPU
> > > > > use
> > > > > cases."
> > > > >
> > > > > The problem is that right now DMABUF is only really used by
> > > DRM,
> > > > > and to
> > > > > quote Daniel, "dma-buf looks like something super generic and
> > > > > useful, until
> > > > > you realize that there's a metric ton of gpu/accelerator bagage
> > > > > piled in".
> > > > >
> > > > > Still, it seems to be the only viable option. We could add a
> > > custom
> > > > > buffer-passing interface, but that would mean implementing the
> > > same
> > > > > buffer-passing interface on the network and USB stacks, and
> > > before
> > > > > we know
> > > > > it we re-invented DMABUFs.
> > > >
> > > > dma-buf also doesn't support sharing with network and usb stacks,
> > > so I'm
> > > > a
> > > > bit confused why exactly this is useful?
> > >
> > > There is an attempt to get dma-buf support in the network stack,
> > > called
> > > "zctap". Last patchset was sent last november. USB stack does not
> > > support
> > > dma-buf, but we can add it later I guess.
> > >
> > > > So yeah unless there's some sharing going on with gpu stuff (for
> > > data
> > > > processing maybe) I'm not sure this makes a lot of sense really.
> > > Or at
> > > > least some zero-copy sharing between drivers, but even that would
> > > > minimally require a dma-buf import ioctl of some sorts. Which I
> > > either
> > > > missed or doesn't exist.
> > >
> > > We do want zero-copy between drivers, the network stack, and the
> > > USB stack.
> > > It's not just about having a userspace interface.
> >
> > I think in that case we need these other pieces too. And we need acks
> > from
> > relevant subsystems that these other pieces are a) ready for upstream
> > merging and also that the dma-buf side of things actually makes sense.
>
> Ok...
>
> > > > If there's none of that then just hand-roll your buffer handling
> > > code
> > > > (xarray is cheap to use in terms of code for this), you can
> > > always add
> > > > dma-buf import/export later on when the need arises.
> > > >
> > > > Scrolling through patches you only have dma-buf export, but no
> > > > importing,
> > > > so the use-case that works is with one of the existing subsystems
> > > that
> > > > supporting dma-buf importing.
> > > >
> > > > I think minimally we need the use-case (in form of code) that
> > > needs the
> > > > buffer sharing here.
> > >
> > > I'll try with zctap and report back.
> >
> > Do you have a link for this? I just checked dri-devel on lore, and it's
> > not there. Nor anywhere else.
>
> The code is here: https://github.com/jlemon/zctap_kernel
>
> I know Jonathan Lemon (Cc'd) was working on upstreaming it, I saw a few
> patchsets.
Yeah if the goal here is to zero-copy from iio to network sockets, then I
think we really need the full picture first, at least as a prototype.
And also a rough consensus among all involved subsystems that this is the
right approach and that there's no fundamental issues. I really have no
clue about network to make a call there.
I'm bringing this up because a few folks wanted to look into zero-copy
between gpu and nvme, using dma-buf. And after lots of
head-banging-against-solid-concrete-walls, at least my conclusion is that
due to locking issues it's really not possible without huge changes to the
block i/o. And those are not on the table.
-Daniel
>
> Cheers,
> -Paul
>
> > We really need all the pieces, and if block layer reaction is anything
> > to
> > judge by, dma-buf wont happen for networking either. There's some really
> > nasty and fairly fundamental issues with locking and memory reclaim that
> > make this utter pain or outright impossible.
> > -Daniel
> >
> > >
> > > Cheers,
> > > -Paul
> > >
> > > > > > >
> > > > > > > Changes since v1:
> > > > > > >
> > > > > > > - the patches that were merged in v1 have been (obviously)
> > > > > dropped
> > > > > > > from
> > > > > > > this patchset;
> > > > > > > - the patch that was setting the write-combine cache
> > > setting
> > > > > has
> > > > > > > been
> > > > > > > dropped as well, as it was simply not useful.
> > > > > > > - [01/12]:
> > > > > > > * Only remove the outgoing queue, and keep the
> > > incoming
> > > > > queue,
> > > > > > > as we
> > > > > > > want the buffer to start streaming data as soon as
> > > it is
> > > > > > > enabled.
> > > > > > > * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now
> > > > > functionally
> > > > > > > the
> > > > > > > same as IIO_BLOCK_STATE_DONE.
> > > > > > > - [02/12]:
> > > > > > > * Fix block->state not being reset in
> > > > > > > iio_dma_buffer_request_update() for output buffers.
> > > > > > > * Only update block->bytes_used once and add a comment
> > > > > about
> > > > > > > why we
> > > > > > > update it.
> > > > > > > * Add a comment about why we're setting a different
> > > state
> > > > > for
> > > > > > > output
> > > > > > > buffers in iio_dma_buffer_request_update()
> > > > > > > * Remove useless cast to bool (!!) in
> > > iio_dma_buffer_io()
> > > > > > > - [05/12]:
> > > > > > > Only allow the new IOCTLs on the buffer FD created
> > > with
> > > > > > > IIO_BUFFER_GET_FD_IOCTL().
> > > > > > > - [12/12]:
> > > > > > > * Explicitly state that the new interface is optional
> > > and
> > > > > is
> > > > > > > not implemented by all drivers.
> > > > > > > * The IOCTLs can now only be called on the buffer FD
> > > > > returned by
> > > > > > > IIO_BUFFER_GET_FD_IOCTL.
> > > > > > > * Move the page up a bit in the index since it is core
> > > > > stuff
> > > > > > > and not
> > > > > > > driver-specific.
> > > > > > >
> > > > > > > The patches not listed here have not been modified since
> > > v1.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > -Paul
> > > > > > >
> > > > > > > Alexandru Ardelean (1):
> > > > > > > iio: buffer-dma: split iio_dma_buffer_fileio_free()
> > > function
> > > > > > >
> > > > > > > Paul Cercueil (11):
> > > > > > > iio: buffer-dma: Get rid of outgoing queue
> > > > > > > iio: buffer-dma: Enable buffer write support
> > > > > > > iio: buffer-dmaengine: Support specifying buffer
> > > direction
> > > > > > > iio: buffer-dmaengine: Enable write support
> > > > > > > iio: core: Add new DMABUF interface infrastructure
> > > > > > > iio: buffer-dma: Use DMABUFs instead of custom solution
> > > > > > > iio: buffer-dma: Implement new DMABUF based userspace
> > > API
> > > > > > > iio: buffer-dmaengine: Support new DMABUF based
> > > userspace API
> > > > > > > iio: core: Add support for cyclic buffers
> > > > > > > iio: buffer-dmaengine: Add support for cyclic buffers
> > > > > > > Documentation: iio: Document high-speed DMABUF based API
> > > > > > >
> > > > > > > Documentation/driver-api/dma-buf.rst | 2 +
> > > > > > > Documentation/iio/dmabuf_api.rst | 94 +++
> > > > > > > Documentation/iio/index.rst | 2 +
> > > > > > > drivers/iio/adc/adi-axi-adc.c | 3 +-
> > > > > > > drivers/iio/buffer/industrialio-buffer-dma.c | 610
> > > > > > > ++++++++++++++----
> > > > > > > .../buffer/industrialio-buffer-dmaengine.c | 42 +-
> > > > > > > drivers/iio/industrialio-buffer.c | 60 ++
> > > > > > > include/linux/iio/buffer-dma.h | 38 +-
> > > > > > > include/linux/iio/buffer-dmaengine.h | 5 +-
> > > > > > > include/linux/iio/buffer_impl.h | 8 +
> > > > > > > include/uapi/linux/iio/buffer.h | 30 +
> > > > > > > 11 files changed, 749 insertions(+), 145 deletions(-)
> > > > > > > create mode 100644 Documentation/iio/dmabuf_api.rst
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > >
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the dri-devel
mailing list