Ideas on portable APIs to cheaply copy a GBM bo?

Tue Oct 10 16:53:25 UTC 2017

Hi Matt,

On 10 October 2017 at 17:12, Matt Hoosier <matt.hoosier at gmail.com> wrote:
> My organization maintains a small patch against the DRM compositor
> that causes it to register another output. This output accepts the
> usual compositor scenegraph, does the rendering down to a primary
> plane, and then funnels the resulting GBM buffer through a codepath
> that does video compression and network transmission. (Why hack this
> into the DRM compositor? Mostly because it has all the infrastructure
> for setting up GBM, which as far as I can tell ends up being pretty
> much a requirement to get access to the composited scene as a dmabuf.)

That more or less makes sense at the moment, though there has been
quite a bit of work on less insane remoting within Weston. And then
GNOME is using PipeWire for this. But anyway, you probably know this
and it's not your immediate question.

> There's a small bit of trouble involved in handing off the dmabuf of
> the FB holding the completed, composited scene to the compress/transmit
> code. That
> stuff runs asynchronously from the Weston event loop by virtue of
> living in GStreamer. This means that it's technically prone to tearing
> if the compositor gets around to flipping the back/front buffer sooner
> than the GStreamer compress/transmit stuff finishes accessing the GBM
> bo's dmabuf.
>
> One possible way to remove this race would be to use GL to take a
> private copy of the rendered primary plane. That's fairly expensive
> though, so it would be nice to avoid if at all possible.

Yeah, that's more or less out of the question.

> Another option is to enforce a synchronous handshake between the
> Weston foreground loop and the compressor/transmit asynchronous code.
> The idea would be to (1) suck out the primary plane GBM bo's dmabuf,
> (2) wait for the async stuff to work and then signal the completion of
> its usage on the BO, and then (3) release the BO locked in the first
> step. This has some pretty bad stalling implications though -- the
> async code can at times take nearly a full frame to run. The spillover
> into the next vblank period would basically force this scheme to run
> at half the normal framerate even though better interleaved use of the
> hardware can do much better.

Well, when you say back/front ... it can be more than just two. By
default, Mesa's GBM implementation should quad-buffer if you sit on
unreleased buffers for long enough. Have you tried it out? That's
definitely what I'd recommend, anyway: all the other options are, as
you've noted, bad.
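
To make the buffer-count argument a bit more concrete, here is a toy
model in C. It is only a sketch under stated assumptions: it makes no
GBM calls and is not how Mesa allocates buffers; sim_rendered_frames()
and its parameters are made up for illustration. It assumes one repaint
per frame period, that a buffer is busy while it is on scanout or while
the encoder still holds it, and that the encoder holds each buffer for
a whole number of frame periods counted from the period in which it was
rendered. In real code the hold/release would map onto
gbm_surface_lock_front_buffer() and gbm_surface_release_buffer().

```c
#include <stdbool.h>

#define SIM_MAX_BUFFERS 8

/*
 * Toy model (hypothetical, not a GBM API): count how many frames the
 * compositor manages to render in `periods` frame periods when it has
 * `num_buffers` buffers and the encoder keeps each rendered buffer for
 * `encoder_hold` periods. Returns -1 on invalid arguments.
 */
int sim_rendered_frames(int num_buffers, int encoder_hold, int periods)
{
    int hold_left[SIM_MAX_BUFFERS] = {0}; /* periods the encoder still needs */
    int scanout = -1;                     /* buffer currently on screen */
    int rendered = 0;

    if (num_buffers < 1 || num_buffers > SIM_MAX_BUFFERS ||
        encoder_hold < 0 || periods < 0)
        return -1;

    for (int p = 0; p < periods; p++) {
        int target = -1;

        /* Repaint: find a buffer that is neither displayed nor held. */
        for (int i = 0; i < num_buffers; i++) {
            if (i != scanout && hold_left[i] == 0) {
                target = i;
                break;
            }
        }

        if (target >= 0) {
            /* Render, hand the buffer to the encoder, and flip to it.
             * The old scanout buffer becomes reusable next period. */
            hold_left[target] = encoder_hold;
            scanout = target;
            rendered++;
        }
        /* else: no free buffer, so the compositor skips this frame. */

        /* End of period: the encoder has had one more period per buffer. */
        for (int i = 0; i < num_buffers; i++) {
            if (hold_left[i] > 0)
                hold_left[i]--;
        }
    }

    return rendered;
}
```

In this model, sim_rendered_frames(2, 3, 60) comes out at 40 (every
third frame is skipped), while sim_rendered_frames(4, 3, 60) gives the
full 60. The exact numbers depend entirely on the assumptions above,
but the point stands: with more buffers to sit on, a slow consumer
stops dictating the repaint rate.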

> Does the readership here on wayland-devel know of any DRM-centric API
> (I looked and nothing came to mind) for leveraging a basic cheap blit
> from one DRM fb into another?

In a lot of cores, you don't get particularly easy access to blit
engines as they're more general-purpose these days. There also isn't a
generalised API at all, even for the more fixed-function hardware.

Cheers,
Daniel