[PATCH 1/7] protocol: add linux_dmabuf extension RFCv1
Daniel Vetter
daniel at ffwll.ch
Mon Jan 19 22:41:52 PST 2015
On Thu, Jan 08, 2015 at 02:26:00PM +0200, Pekka Paalanen wrote:
> On Mon, 5 Jan 2015 11:44:49 +0100
> Daniel Vetter <daniel at ffwll.ch> wrote:
>
> > On Thu, Dec 18, 2014 at 01:45:22PM +0200, Pekka Paalanen wrote:
> > > On Thu, 18 Dec 2014 10:25:09 +0100
> > > Daniel Vetter <daniel at ffwll.ch> wrote:
> > >
> > > > On Fri, Dec 12, 2014 at 04:51:02PM -0500, Louis-Francis
> > > > Ratté-Boulianne wrote:
> > > > > From: Pekka Paalanen <pekka.paalanen at collabora.co.uk>
> > > > >
> > > > > An experimental (hence the 'z' prefix) linux_dmabuf Wayland protocol
> > > > > extension for creating dmabuf-based wl_buffers in a generic manner.
> > > > >
> > > > > This does not include proper dmabuf metadata negotiation because
> > > > > there is no way to communicate all dmabuf constraints from the
> > > > > compositor to a client before-hand. The client has to create a
> > > > > wl_buffer wrapping one or more dmabuf buffers and then listen at
> > > > > the feedback object returned to know if the operation was
> > > > > successful.
> > > > >
> > > > > Signed-off-by: Pekka Paalanen <pekka.paalanen at collabora.co.uk>
> > > > > Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb at collabora.com>
> > > >
> > > > So I have no idea about how wayland protos work, so please take that
> > > > into account ;-)
> > >
> > > Hi Daniel,
> > >
> > > very much appreciated anyway!
> > >
> > > > Generally I think we should try to follow what the drm addfb2 ioctl
> > > > does in drm, since in the end we need to be able to handle pretty
> > > > much all the same issues. Well wayland needs to solve a few more
> > > > since it also must cope with buffer layouts which can only be used
> > > > for rendering but not scanned out.
> > >
> > > Yes. The current proposal was written based on
> > > https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt
> > > which has some differences compared to addfb2, like having only 3
> > > planes instead of 4.
> > >
> > > > Wrt tiling layouts and compressed buffers and similar vendor specific
> > > > things: The current proposal is to add a new uint64_t layout
> > > > qualifier to addfb with opaque #defines (probably a new header), with
> > > > an 8:56 bit split between vendor identifier and opaque vendor
> > > > specified content. It will also be per-buffer (like pitches/offsets).
> > > > The abi isn't locked down yet, but definitely something to keep in
> > > > mind.
> > >
> > > Yup, I've seen it.
> > >
> > > > It will definitely complicate the protocol though since the specific
> > > > layout modifiers which are acceptable change dynamically at runtime:
> > > > Some scanout formats become invalid when we run out of fifo space,
> > > > some can only be used on specific planes and ofc this all depends
> > > > upon whether gl compositing or hw planes are used to pick the right
> > > > one.
> > >
> > > From the Wayland compositor point of view, things would get awfully
> > > hard if an existing wl_buffer (handle to a dmabuf object) could
> > > suddenly become completely unusable. That is why we try to make sure
> > > during the wl_buffer creation phase (when a client initially shares the
> > > dmabuf with the compositor) that the buffer is always usable for at
> > > least compositing (i.e. as a GL texture).
> > >
> > > If the buffer is also usable for direct scanout, that's a nice bonus.
> > > It's no problem to fall back to compositing if a buffer suddenly or
> > > temporarily turns out to be not scanout-able.
> > >
> > > We just need to guarantee that the compositing path always works. I
> > > don't think we can say to a client "oops, go make a different kind of
> > > buffer, this no longer works".
> > >
> > > Do you think these expectations are realistic?
> >
> > Yeah gl should always work. The complications really are just about trying
> > to get the optimum out of the plane hardware and the limited resources it
> > has (fifo space, limits for rotation, compressed formats and whatever
> > else). I expect that long-term wayland compositors need some form of hw
> > abstraction (like hwc from android) to really get the most out of the hw -
> > occasionally the limits are really crazy. That component would need to be
> > able to pass hints to clients a la "please use this hw specific layout for
> > the next frame". But that's mostly about picking the right tiling layout
> > or compressed format. So maybe something for a further revision (once we
> > have the addfb2 extension merged into the kernel).
>
> Yes, a further revision sounds good for such hints. I don't think we
> even know what the protocol additions should look like yet, or how a
> client would use the suggestions.
>
> ...
>
> > > > > + <interface name="dmabuf_batch" version="1">
> > > > > + <description summary="a collection of dmabufs">
> > > > > + This is a collection of dmabufs, forming a single logical
> > > > > buffer.
> > > > > + Usually only one dmabuf is needed, but some multi-planar
> > > > > formats
> > > > > + may require more.
> > > > > +
> > > > > + The order of dmabufs added for this object is significant,
> > > > > and must
> > > > > + match the expectations of the format argument to
> > > > > + linux_dmabuf.create_buffer.
> > > > > + </description>
> > > > > +
> > > > > + <request name="destroy" type="destructor">
> > > > > + <description summary="delete this object, used or not">
> > > > > + </description>
> > > > > + </request>
> > > > > +
> > > > > + <request name="add">
> > > > > + <description summary="add a dmabuf to the wl_buffer">
> > > > > + Multi-planar formats may require using more than one
> > > > > + dmabuf for passing all the data for one logical buffer.
> > > > > + This request adds one dmabuf to the set in this
> > > > > dmabuf_batch. +
> > > > > + When one dmabuf has several planar channels, offset1 &
> > > > > stride1 and
> > > > > + offset2 & stride2 must be used to denote them without
> > > > > sending a
> > > > > + new add_dmabuf request which would create a new fd in the
> > > > > server
> > > > > + while still pointing at the same dmabuf.
> > > > > +
> > > > > + offset0 and stride0 must always be set. Other unused
> > > > > offsets and
> > > > > + strides must be zero.
> > > > > + </description>
> > > > > +
> > > > > + <arg name="name" type="fd"/>
> > > > > + <arg name="offset0" type="int"/>
> > > > > + <arg name="stride0" type="int"/>
> > > > > + <arg name="offset1" type="int"/>
> > > > > + <arg name="stride1" type="int"/>
> > > > > + <arg name="offset2" type="int"/>
> > > > > + <arg name="stride2" type="int"/>
> > > > > + </request>
> > > >
> > > > Hm, with addfb we just ask userspace to pass the same fb 3/2 times for
> > > > planar data that's all in the same buffer object.
> > >
> > > Yeah, I chose to handle the fds separately, if that is what you mean.
> > > The file descriptors are passed through unix socket SCM_RIGHTS
> > > messages, and libwayland deliberately also dup's them. If a client sends
> > > the same fd several times, the compositor will get different fds
> > > all pointing to the same "file".
> > >
> > > Can that be a problem for GBM, EGL, or addfb2 et al. dma_buf import?
> > > Or rather, an implementation that makes it a problem, it legal or buggy?
> >
> > buggy. But right now the worst-case scenario is that you wast some gpu
> > gart by mapping the same underlying memory multiple times on buggy
> > drivers. So I don't think you should care (and given that there's still
> > corner-cases in upstream drm drivers in this area I guess no one else does
> > care really either).
>
> Ok. This should let us make the protocol simpler as we don't need
> to maintain the exact numerical fd identities.
>
> > But as long as you don't do crazy things the drm dma-buf import will
> > notice that it's the same dma-buf and dtrt. Crazy = actively trying to
> > reimport buffers while some other thread is dropping the last drm
> > use-count for the same.
>
> I'm not sure how one could even attempt such things.
You need your dma-buf to come from some other device (e.g. v2l pipeline),
otherwise it's impossible. And if it ever becomes an issue userspace
shouldn't ever care since the problem is in the kernel really.
> > > I assumed it could be a problem, so I designed a way that maintains the
> > > numerical identities between the fds.
> > >
> > > Also, we don't have a mechanism to send "not a valid fd" for an fd
> > > field, in case there are less than 3 (4?) planes, so we need a separate
> > > request for each valid fd.
> >
> > Maybe we could share the drm helpers to compute the number of planes
> > somehow and make it part of the proto? tbh no idea whether that would help
> > you (I have no clue about wl protos after all ...).
>
> Do you mean compute the number of planes from a DRM fourcc code? I
> can't see how that would help protocol-wise. We have no such thing
> as variable arguments. Every argument listed in the XML
> specification of a message must be there exactly. If an argument is
> specified to be an fd, it must also be a valid fd at runtime AFAIK.
>
> To send 1-4 fds, one needs to either pick a message type (request)
> that specifies exactly the right number of fds, or use a single
> message type to send each fd one by one. The difference in IPC
> efficiency is negligible, I believe.
>
> I suppose one could use a single message type with 4 fds always,
> just repeating the unwanted arguments as one of the wanted fds, but
> it feels... ugly and inefficient (extra system calls to transmit
> extra fds), especially if we usually need just one fd.
>
> At least we could simplify the dmabuf_batch interface by leaving
> just one offset/stride argument pair in the "add" request. That
> would make it a lot more clear.
>
> And we should bump the limit to 4 dmabufs per wl_buffer, like you
> pointed out from AddFB2. The above simplification makes this very
> easy. In fact, we wouldn't even need to specify any max number in
> the protocol at all which is nice.
Well I was more thinking about the validation the wayland server is
supposed to do, not necessarily whether the protocol allows it. I.e. match
the input validation to what the drm core does (checks for number of
planes, that kind of stuff). Otoh I guess you could just do the addfb2
synchronously and reuse the checking the kernel does by forwarding the
-EINVAL.
I just thinking that some spec language along "the kernel's drm subsystem
is the authoritative sources for how these fourcc codes work" would be
good.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the wayland-devel
mailing list