[RFC PATCH 2/2] vc4: introduce DMA-BUF heap

Dave Stevenson dave.stevenson at raspberrypi.com
Thu Nov 9 15:42:38 UTC 2023


Hi Simon and Maxime

On Thu, 9 Nov 2023 at 09:12, Maxime Ripard <mripard at kernel.org> wrote:
>
> Hi Simon,
>
> On Thu, Nov 09, 2023 at 07:45:58AM +0000, Simon Ser wrote:
> > User-space sometimes needs to allocate scanout-capable memory for
> > GPU rendering purposes. On a vc4/v3d split render/display SoC, this
> > is achieved via DRM dumb buffers: the v3d user-space driver opens
> > the primary vc4 node, allocates a DRM dumb buffer there, exports it
> > as a DMA-BUF, imports it into the v3d render node, and renders to it.
> >
> > However, DRM dumb buffers are only meant for CPU rendering; they are
> > not intended to be used for GPU rendering. Primary nodes should only
> > be used for mode-setting purposes; other programs should not attempt
> > to open them. Moreover, opening the primary node is already broken on
> > some setups: systemd grants permission to open primary nodes to
> > physically logged in users, but this breaks when the user is not
> > physically logged in (e.g. headless setup) and when the distribution
> > is using a different init (e.g. Alpine Linux uses openrc).
> >
> > We need an alternate way for v3d to allocate scanout-capable memory.
> > Leverage DMA heaps for this purpose: expose a CMA heap to user-space.
> > Preliminary testing has been done with wlroots [1].
> >
> > This is still an RFC. Open questions:
> >
> > - Does this approach make sense to y'all in general?
>
> Makes sense to me :)
>
> > - What would be a good name for the heap? "vc4" is maybe a bit naive and
> >   not precise enough. Something with "cma"? Do we need to plan a naming
> >   scheme to accommodate multiple vc4 devices?
>
> That's a general issue though that happens with pretty much all devices
> with a separate node for modesetting and rendering, so I don't think
> addressing it only for vc4 makes sense, we should make it generic.
>
> So maybe something like "scanout"?
>
> One thing we need to consider too is that the Pi5 will have multiple
> display nodes (4(?) iirc) with different capabilities, vc4 being only
> one of them. This will impact that solution too.

It does need to scale.

Pi5 adds 4 additional DRM devices (2xDSI, 1xDPI, and 1xComposite
Video), and just this last week I've been running Wayfire with TinyDRM
drivers for SPI displays and UDL (DisplayLink) outputs as well.
Presumably all of those drivers need to have appropriate hooks added
so they each expose a dma-heap to enable scanout buffers to be
allocated.

Can we add another function pointer to the struct drm_driver (or
similar) to do the allocation, and move the actual dmabuf handling
code into the core?
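
To make that question a bit more concrete, here is a rough user-space
model of the split. It is only a sketch: every name in it
(sketch_drm_driver, scanout_alloc, sketch_core_heap_alloc) is
hypothetical and does not exist in the kernel, and calloc() stands in
for whatever CMA or other allocation a real driver would do. The point
is just that the driver supplies one callback and the shared code
handles everything else, including drivers that do not provide a hook.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the buffer the core would export as a dma-buf. */
struct sketch_buffer {
	size_t size;
	void *backing;
};

/* Hypothetical stand-in for struct drm_driver with one extra hook. */
struct sketch_drm_driver {
	const char *name;
	/* Returns scanout-capable memory of the given size, or NULL. */
	void *(*scanout_alloc)(size_t size);
};

/*
 * Shared "core" path: validates the request, calls the driver hook and
 * fills in the buffer. Returns 0 on success, -1 if the driver has no
 * hook or the allocation fails.
 */
static int sketch_core_heap_alloc(const struct sketch_drm_driver *drv,
				  size_t size, struct sketch_buffer *out)
{
	void *mem;

	if (!drv || !drv->scanout_alloc || !out || size == 0)
		return -1;

	mem = drv->scanout_alloc(size);
	if (!mem)
		return -1;

	out->size = size;
	out->backing = mem;
	return 0;
}

/* Example driver hook; calloc() is a placeholder for a CMA allocation. */
static void *sketch_vc4_scanout_alloc(size_t size)
{
	return calloc(1, size);
}

static const struct sketch_drm_driver sketch_vc4 = {
	.name = "vc4",
	.scanout_alloc = sketch_vc4_scanout_alloc,
};

/* A driver that has not been converted simply leaves the hook unset. */
static const struct sketch_drm_driver sketch_no_hook = {
	.name = "no-hook",
	.scanout_alloc = NULL,
};
```

With that shape, adding scanout allocation to another driver would
mean filling in one function pointer rather than duplicating the heap
and dma-buf code per driver.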

> > - Right now root privileges are necessary to open the heap. Should we
> >   allow everybody to open the heap by default (after all, user-space can
> >   already allocate arbitrary amounts of GPU memory)? Should we leave it
> >   up to user-space to solve this issue (via logind/seatd or a Wayland
> >   protocol or something else)?
>
> I would have expected a udev rule to handle that?
>
> > TODO:
> >
> > - Need to add !vc5 support.
>
> If by !vc5 you mean RPi0-3, then it's probably not needed there at all
> since it has a single node for both modesetting and rendering?

That is true, but vc4 could potentially be rendering for scanout via
UDL or SPI. Is it easier to always have the dma-heap allocator for
every DRM card, rather than making userspace mix and match depending
on whether the device is all-in-one or split render/display?

  Dave

