[RFC wayland] System compositor protocol

Sun Aug 25 07:11:07 PDT 2013

Hi

On Fri, Aug 23, 2013 at 11:55 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> Hello All,
> I am in the process of picking back up the old idea of system compositors.
> I am not, at the moment, looking for a review of the code; simply a review
> of the concept and the proposed protocol.  If you would like to look at my
> implementation or try it out, it can be found in the system-compositor
> branch of my personal [weston fork on github][1].
>
> What follows is what I envision for system compositors. (Others may have a
> different idea which is why I'm writing this RFC email.) The basic idea
> behind a system compositor is to provide an interface for compositors whose
> only job it is to display other compositors or other stand-alone
> full-screen interfaces.  I imagine three primary purposes for system
> compositors:
>
> 1. As an abstraction layer.  Every time someone wants to write a RDP
>    server, VNC server, android backend, or the like the standard answer is
>    "write it as a weston plugin".  The problem with this is that, to my
>    knowledge, none of the major desktop environments (GNOME, KDE, EFL,
>    etc.) plan to run their compositor as a weston shell plugin.  This makes
>    a weston plugin a poor solution in the long term.  On the other hand, a
>    system compositor is fairly simple to implement and provides a backend
>    to any compositor that can run on top of a system compositor.

KDE planned on using weston as the underlying compositor. But I doubt
GNOME or EFL want to use something like that.

> 2. Simple full-screen clients.  With standard VTs going the way of the dodo
>    bird (see also [David Herrmann's work on logind][2]), we will need a way
>    for displaying simple full-screen programs such as splashscreens without
>    every single one of them knowing how to talk DRM/KMS directly.  David
>    has proposed to use a system compositor (which he calls wlsystemc in his
>    post) so these types of programs can be written as simple wayland
>    clients.  My github repository contains the (almost trivial)
>    modifications to simple-shm that make it into one such client.

Yay, I appreciate someone picking up this idea \o/

> 3. As a DRM/KMS backend for other compositors.  Manually dealing with
>    DRM/KMS isn't actually all that many lines and all of the big
>    compositors (mutter, Kwin, EFL, etc.) will do it themselves.  However,
>    for people who want to write their own simple compositor, it can be a
>    bit tricky to get right and raises the barrier to entry.  The protocol
>    I'm proposing is sufficiently powerful to provide most of the basic
>    multi-output support and modesetting that is needed for a simple
>    compositor.  This means that all a potential compositor writer has to do
>    is write a system compositor client and they can let someone else get
>    the KMS details right.

Manually dealing with DRM/KMS _is_ hard. Yes, the basic setup is easy,
but I hate the fact that we now end up with everyone doing it on their
own. Advanced things can get pretty complex, like GPU selection,
multi-GPU handling, GPU hotplugging, CRTC hotplugging, planes,
cursors, DRM-Master, DPMS, DRM properties, atomic modesetting,
render-nodes, ... ugh, this list can get so long. I am a little bit
scared that EFL or GNOME get it wrong, but ok, not my deal.

> While it sounds like a big task, implementing a system compositor isn't
> that bad.  The simplest system compositor is one that can simply display
> surfaces.  In order to do that, you need to implement the following
> interfaces:
>
>  * wl_compositor
>  * wl_region
>  * wl_shm (along with wl_shm_pool and a SHM-based wl_buffer)
>  * wl_surface
>  * wl_output
>
> None of those are particularly difficult or complicated to implement.  The
> hardest is probably wl_surface and that's not terrible.
>
> For input you have two options.  The first, most obvious option is to
> implement wl_seat and its child interfaces.  Second, I have considered the
> idea of a wl_raw_input interface (only for system compositors) that simply
> provides evdev file descriptors to the client.  This would allow for raw
> input processing without having to worry about things like weston-launch.
> Also the system compositor *may* be able to handle the FD muting etc. for
> playing nice with logind. (I'm not sure on that one yet, I'd have to ask
> David.)
>
> One more note on input: I do NOT expect the system compositor to do any
> significant input processing.  Even if the child compositor is getting its
> events through wl_seat, I expect them to be about as raw as possible.  In
> particular, this is not where things like pointer acceleration should be
> handled.  Once we get a pointer_grab interface working, I expect any
> compositor that runs on top of a system compositor to attempt a pointer
> grab almost immediately and do its own pointer handling from there.
> Otherwise it is impossible for the client compositor to re-arrange the
> outputs without additional protocol.
>
> Along with the basics as described above, a system compositor could
> optionally implement additional interfaces to provide additional
> functionality.  For example:
>
>  * A RDP system compositor could implement wl_data_device to allow for
>    drag-and-drop and clipboard sharing between host and client.
>
>  * A system compositor could expose wl_subcompositor and use subsurfaces to
>    allow the client compositor to take advantage of hardware compositing.
>    Note that this is not as simple as it sounds because hardware frequently
>    has limits on the number, size, and placement of such overlays and there
>    is currently no way to communicate that information through the
>    subsurface protocol.  Therefore, the system compositor could find itself
>    doing a full composite anyway.
>
>  * A wl_user_switcher interface could be created for login managers.  This
>    steps on logind a bit, but someone may find it useful.
>
> That is about all I have to say for now on the subject.  If you want to
> check out my current implementation you can look at my github as I said
> above.  As of right now, I have multi-monitor weston running inside of my
> weston-based system compositor.  Also, I have simple-shm modified to act as
> a simple system compositor client if it finds the wl_system_compositor global.
> The only mode that works right now is "center" but I'm working on getting
> the others to work.
>
> I appreciate any questions or comments you may have on either the main
> ideas (above) or the protocol (below).

I really like having an intermediate layer for simple tasks. Some
layer that is well maintained and allows things like plymouth to run
on top of it. However, I dislike calling it system-compositor. Let me
explain why:

In a session there are several different running daemons. Each of them
provides a different service to ease device usage. This includes
pulseaudio, dbus, polkit, colord, .. and of course the
session-compositor (xserver, weston, ...). But all of these run
_inside_ the session. If properly done, they don't need any super-user
privileges. If we switch sessions (whether via logind or via VT
doesn't matter), we can simply change device-access restrictions and
the new session can start running. We handover device access from one
session to another. My work on logind doesn't change this or reinvent
it, it only tries to make it _safe_ and _reliable_.

The idea behind a system-compositor was to provide a system daemon
that runs _outside_ a session. Its sole responsibility is to control
access to graphics and input hardware. So session-compositors no
longer access hardware directly, but instead tunnel it through the
system-compositor. But this means, the system-compositor must know of
session-switches and correctly display only the session-compositor of
the active session. However, session-switching is controlled by
logind, so the system-compositor gets the session-switch notification
_after_ the session was actually switched, making this kind of racy
(but still ok!).
The bigger problem is, the system-compositor is not part of a session
so it has to be active *all the time*. You cannot have some sessions
using the system-compositor and other sessions doing it the old way.
You cannot do device-access handover from the system-compositor to a
self-hosting or legacy session. This would require ugly racy hacks and
conflict with VT!
But if the system-compositor is always active, you cannot use VTs in
text-mode. Because VTs in text-mode access graphics hardware
*directly*.

With this in mind, we ditched the idea of a system-compositor. But
that doesn't mean the idea of tunneling graphics access is wrong! In
fact, I like it a lot, but please don't call it system-compositor! A
system-compositor is what Ubuntu people are trying and RAOF just
recently made me aware that they ran into exactly the problems I just
mentioned. So please don't do that.

So back to your proposal. I'd like to see something like you did but
as a session-compositor (call it whatever you want ;)). So a session
that doesn't want to deal with DRM directly (like for instance
gdm/xdm/kdm) could avoid starting an xserver or weston and start your
session-compositor instead. It then displays its content via the
standard wayland client APIs on this compositor. But this compositor
imho should run in a session. So it does *not* allow clients from
different sessions to connect. It is *no* system compositor. It's just
a daemon which provides graphics-access to the session. It may even
allow switching between multiple surfaces. And if you continue this
thought, you will notice that it is nothing more than weston but with
a *very* reduced wl_shell. Precisely a wl_shell that displays only one
surface at a time.

So if the protocol is well-defined, we could standardize some path
(eg., /run/user/<num>/wayland-fullscreen-<sid>.socket) which is
provided by such a session-compositor. gdm could then during startup
check whether this path exists (that is, such a compositor is running
in its session) and use it to access graphics-devices. If the API
allows multiple surfaces, a separate "system-log"-process could run in
the _same_ session and use this compositor to display the current
system-log. Your compositor could react to keypresses like alt+1 or
alt+2 (or ESC) to switch between the surfaces. That is, the gdm
session could provide the system-log just one key-press away. Both in
the same session! You don't have to run a full-blown
VT/kmscon/whatever-log for that on another session on VT12..
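
To make the startup check concrete, here is a rough sketch in C. It only
assumes the example path layout from above; nothing about that path is
standardized yet, and the function names fullscreen_socket_path() and
is_socket() are made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Build "/run/user/<uid>/wayland-fullscreen-<sid>.socket" into buf.
 * The layout is only the example path from this mail, not a standard.
 * Returns 0 on success, -1 on a bad session id or a too-small buffer. */
int fullscreen_socket_path(char *buf, size_t len,
                           unsigned long uid, const char *sid)
{
    assert(buf != NULL && sid != NULL);

    /* Reject empty ids and ids that could escape the runtime dir. */
    if (sid[0] == '\0' || strchr(sid, '/') != NULL)
        return -1;

    int n = snprintf(buf, len,
                     "/run/user/%lu/wayland-fullscreen-%s.socket", uid, sid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Return 1 if path exists and is a socket, 0 otherwise. */
int is_socket(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;
    return S_ISSOCK(st.st_mode) ? 1 : 0;
}
```

A login manager would call fullscreen_socket_path() with its own uid and
session id, and fall back to starting its own display server if
is_socket() returns 0 for the result.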

> Protocol follows:
> -----------------
>
> <protocol name="system_compositor">
>   <interface name="wl_system_compositor" version="1">
>     <description summary="Displays a single surface per output">
>       Displays a single surface per output.
>
>       This interface can only be bound to by one client at a time.
>     </description>
>
>     <enum name="fullscreen_method">
>       <description summary="different method to set the surface fullscreen">
>         Hints to indicate to the compositor how to deal with a conflict
>         between the dimensions of the surface and the dimensions of the
>         output. The compositor is free to ignore this parameter.
>       </description>
>       <entry name="default" value="0" summary="no preference, apply default policy"/>
>       <entry name="scale" value="1" summary="scale, preserve the surface's aspect ratio and center on output"/>
>       <entry name="driver" value="2" summary="switch output mode to the smallest mode that can fit the surface, add black borders to compensate size mismatch"/>
>       <entry name="fill" value="3" summary="no upscaling, center on output and add black borders to compensate size mismatch"/>
>     </enum>
>
>     <request name="present_surface">
>       <description summary="present surface for display">
>         This requests the system compositor to display surface on output.
>         Each client of the system compositor can have at most one surface
>         per output at any one time. Subsequent requests with the same
>         output replace the surface bound to that output.  The same surface
>         may be presented on multiple outputs.
>
>         If the output is null, the compositor will present the surface on
>         whatever display (or displays) it thinks best.  In particular, this
>         may replace any or all surfaces currently presented so it should
>         not be used in combination with placing surfaces on specific
>         outputs.
>
>         The method specifies how the surface is to be persented.  These
>         methods are identical to those in wl_shell_surface.set_fullscreen.
>       </description>
>       <arg name="surface" type="object" interface="wl_surface"/>
>       <arg name="method" type="uint"/>
>       <arg name="framerate" type="uint"/>
>       <arg name="output" type="object" interface="wl_output" allow-null="true"/>
>     </request>
>   </interface>
> </protocol>

Yepp, that's all we need. Just rename it from "system_compositor" to
"wl_fullscreen_shell" (I bet you can come up with some fancier name).
No other criticism on this proposal from my side.

Feel free to disagree ;) I am open to suggestions or criticism.
Cheers
David