[PATCH RFC wayland-protocols] unstable/linux-dmabuf: add wp_linux_dmabuf_device_hint

Tue Nov 13 18:19:29 UTC 2018

> Hi Simon,
>
> On Fri, 2018-11-02 at 18:49 +0000, Simon Ser wrote:
> > On Friday, November 2, 2018 12:30 PM, Philipp Zabel <p.zabel at pengutronix.de> wrote:
> > > > > +    <event name="primary_device">
> > > > > +      <description summary="preferred primary device">
> > > > > +        This event advertises the primary device that the server prefers. There
> > > > > +        is exactly one primary device.
> > >
> > > Which device should this be if the scanout engine is separate from the
> > > render engine (e.g. IPU/imx-drm and GPU/etnaviv on i.MX6)?
> >
> > When the surface hints are created, I expect the compositor to send the device
> > it uses for compositing as the primary device (assuming it's using only one
> > device).
>
> i.MX6 has a separate scanout device without any acceleration capabilities
> except some hardware overlay planes, and a pure GPU render device without
> any connection to the outside world. The compositor uses both devices for
> compositing and output.

But most of the time, client buffers will go through compositing. So the
primary device is still the render device.

The situation doesn't change a lot compared to wl_drm to be honest. The device
that is advertised via wl_drm will be the primary device advertised by this
protocol.

Maybe when the compositor decides to scan out a client's buffers directly, it
can switch the primary device to the scanout device. Sorry, I don't know enough
about these particular devices to say for sure.

> > When the surface becomes fullscreen on a different GPU (meaning it becomes
> > fullscreen on an output which is managed by another GPU), I'd expect the
> > compositor to change the primary device for this surface to this other GPU.
> >
> > If the compositor uses multiple devices for compositing, it'll probably switch
> > the primary device when the surface is moved from one GPU to the other.
> >
> > I'm not sure how i.MX6 works, but: even if the same GPU is used for compositing
> > and scanout, but the compositing preferred formats are different from the
> > scanout preferred formats, the compositor can update the preferred format
> > without changing the preferred device.
> >
> > Is there an issue with this? Maybe something should be added to the protocol to
> > explain it better?
>
> It is not clear to me from the protocol description whether the primary
> device means the scanout engine or the GPU, in case they are different.
>
> What is the client process supposed to do with this fd? Is it expected
> to be able to render on this device? Or use it to allocate the optimal
> buffers?

The client is expected to allocate its buffers on that device. I'm not sure
whether it is also expected to render there.

> > > What about contiguous vs non-contiguous memory?
> > >
> > > On i.MX6QP (Vivante GC3000) we would probably want the client to always
> > > render DRM_FORMAT_MOD_VIVANTE_SUPER_TILED, because this can be directly
> > > read by both texture samplers (non-contiguous) and scanout (must be
> > > contiguous).
> > >
> > > On i.MX6Q (Vivante GC2000) we always want to use the most efficient
> > > DRM_FORMAT_MOD_VIVANTE_SPLIT_SUPER_TILED, because neither of the
> > > supported render formats can be sampled or scanned out directly.
> > > Since the compositor has to resolve into DRM_FORMAT_MOD_VIVANTE_TILED
> > > (non-contiguous) for texture sampling or DRM_FORMAT_MOD_LINEAR
> > > (contiguous) for scanout, the client buffers can always be non-
> > > contiguous.
> > >
> > > On i.MX6S (Vivante GC880) the optimal render format for texture sampling
> > > would be DRM_FORMAT_MOD_VIVANTE_TILED (non-contiguous) and for scanout
> > > DRM_FORMAT_MOD_VIVANTE_SUPER_TILED (non-contiguous) which would be
> > > resolved into DRM_FORMAT_MOD_LINEAR (contiguous) by the compositor.
> >
> > I think all of this works with Daniel's design.
> >
> > > All three could always handle DRM_FORMAT_MOD_LINEAR (contiguous) client
> > > buffers for scanout directly, but those would be suboptimal if the
> > > compositor decides to render on short notice, because the client would
> > > have already resolved into linear and then the compositor would have to
> > > resolve back into a texture sampler tiling format.
> >
> > Is the concern here that switching between scanout and compositing is
> > non-optimal until the client chooses the preferred format?
>
> My point is just that whether or not the buffer must be contiguous in
> physical memory is the essential piece of information on i.MX6QP,
> whereas the optimal tiling modifier is the same for both GPU composition
> and direct scanout cases.
>
> If the client provides non-contiguous buffers, the "optimal" tiling
> doesn't help one bit in the scanout case, as the scanout hardware can't
> read from those.

Sorry, I don't get what you mean. Can you please try to explain again?