Collaboration on standard Wayland protocol extensions

Carsten Haitzler (The Rasterman) raster at
Sun Mar 27 23:55:33 UTC 2016

On Sun, 27 Mar 2016 16:34:37 -0400 Drew DeVault <sir at> said:

> Greetings! I am the maintainer of the Sway Wayland compositor.
> It's almost the Year of Wayland on the Desktop(tm), and I have
> reached out to each of the projects this message is addressed to (GNOME,
> Kwin, and wayland-devel) to collaborate on some shared protocol
> extensions for doing a handful of common tasks such as display
> configuration and taking screenshots. Life will be much easier for
> projects like ffmpeg and imagemagick if they don't have to implement
> compositor-specific code for capturing the screen!
> I want to start by establishing the requirements for these protocols.
> Broadly speaking, I am looking to create protocols for the following
> use-cases:
> - Screen capture

i can tell you that screen capture is a security sensitive thing and likely
won't get a regular wayland protocol. it definitely won't from e. if you can
capture screen, you can screenscrape. some untrusted game you downloaded for
free can start watching your internet banking and see how much money you have
in which accounts where...

the simple solution is to build it into the wm/desktop itself as an explicit
user action (keypress, menu option etc.) and now it can't be exploited as it's
not programmatically available. :)

i would imagine the desktops themselves would in the end provide video capture
like they would stills.

of course you have the more nasty variety of screencapture which is "screen
sharing" where you don't want to just store to a file but broadcast live. and
this then even gets worse - you would want to be able to inject events -
control the mouse, keyboard etc. from an app. this is a nasty slippery slope
that i, at least, don't want to walk down any time soon. this is a bit of a
pandora's box of security holes to open up.

> - Output configuration

why? currently pretty much every desktop provides its OWN output configuration
tool that is part of the desktop environment. why do you want to re-invent
randr here, allowing any client to mess with screen config? after YEARS of games
using xvidtune and what not to mess up screen setups this would be a horrible
idea. if you want to make a presentation tool that uses 1 screen for output and
another for "controls" then that's a matter of providing info that multiple
displays exist and what type they may be (internal, external) and clients can
tag surfaces with "intents" eg - this is a control surface, this is an
output/display surface. compositor will then assign them appropriately.
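
a rough sketch of the intent idea above, in python. everything named here (the
Output/Surface types, the "control"/"presentation" intent strings, the output
names, the placement policy) is made up for illustration and is not an existing
wayland protocol. the point is only that the client states what a surface is
for and the compositor decides where it goes:

```python
from dataclasses import dataclass


# Hypothetical types: no Wayland protocol defines these names or intents.
@dataclass
class Output:
    name: str
    kind: str  # "internal" or "external"


@dataclass
class Surface:
    title: str
    intent: str  # "control" or "presentation"


def assign_outputs(surfaces, outputs):
    """Compositor-side policy: pick an output name for each surface by intent."""
    internal = [o for o in outputs if o.kind == "internal"]
    external = [o for o in outputs if o.kind == "external"]
    fallback = outputs[0]
    placement = {}
    for s in surfaces:
        if s.intent == "presentation" and external:
            placement[s.title] = external[0].name
        elif s.intent == "control" and internal:
            placement[s.title] = internal[0].name
        else:
            # No matching display type: the compositor still decides, not the client.
            placement[s.title] = fallback.name
    return placement


outputs = [Output("eDP-1", "internal"), Output("HDMI-A-1", "external")]
surfaces = [Surface("slides", "presentation"), Surface("speaker notes", "control")]
print(assign_outputs(surfaces, outputs))
```

the client never touches the screen configuration; with only one display
attached, both surfaces simply land on it.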

same for games. same for media usage. etc. - there is little to no need for
clients to go messing with screen setup. this is a desktop/compositor task that
will be handled by that DE as it sees fit (some may implement a wl protocol but
only on a specific FD - maybe a socketpair to a forked child) or something dbus
or some private protocol or maybe even build it directly in to the compositor.
the same technique can be used to allow extended protocol for specific clients
too (socketpair etc.) but just don't expose at all what is not needed.
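
to make the socketpair idea concrete, here is a minimal python sketch, assuming
a unix system with fork. the helper name, the "set-mode" request text and the
"ok" reply are invented for the example. in a real setup the child would exec
the config tool with its end of the pair passed as WAYLAND_SOCKET (which
libwayland's wl_display_connect honours) and the compositor would hand its end
to wl_client_create; here plain bytes stand in for the protocol:

```python
import os
import socket


def spawn_privileged_helper(handler):
    """Fork a child that owns one end of a private socketpair.

    Returns (parent_sock, child_pid). Only this child can reach parent_sock,
    so whatever is served on it is invisible to ordinary clients.
    """
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: keep only its own end of the pair.
        parent_sock.close()
        try:
            handler(child_sock)
        finally:
            child_sock.close()
            os._exit(0)
    # Parent (the compositor in this sketch): drop the child's end.
    child_sock.close()
    return parent_sock, pid


def config_tool(sock):
    # Hypothetical request format, standing in for a private protocol.
    sock.sendall(b"set-mode HDMI-A-1 1920x1080\n")
    sock.recv(64)  # wait for the compositor's reply


def compositor_serve_one(sock):
    request = sock.recv(256).decode().strip()
    sock.sendall(b"ok\n")
    return request


sock, pid = spawn_privileged_helper(config_tool)
request = compositor_serve_one(sock)
os.waitpid(pid, 0)
sock.close()
print(request)
```

the extended requests exist only on that one fd, so nothing extra is advertised
on the public socket.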

> - More detailed surface roles (should it be floating, is it a modal,
>   does it want to draw its own decorations, etc)

that seems sensible and over time i can imagine this will expand.

> - Input device configuration

as above. i see no reason clients should be doing this. surface
intents/roles/whatever can deal with this. compositor may alter how an input
device works for that surface based on this.

> I think that these are the core protocols necessary for
> cross-compositor compatibility and to support most existing tools for
> X11 like ffmpeg. Considering the security goals of Wayland, it will also
> likely be necessary to implement some kind of protocol for requesting
> and granting sensitive permissions to clients.
> How does this list look? What sorts of concerns do you guys have with
> respect to what features each protocol needs to support? Have I missed
> any major protocols that we'll have to work on? Once we have a good list
> of requirements I'll start writing some XML.

as above. anything apps have no business messing with i have no interest in
having any protocol for. input device config, screen setup config etc. etc. for
sure. screen capture is a nasty one and for now - no. no access. for the common
case the DE can do it. for screen sharing kind of things... you also need input
control (take over mouse and be able to control from app - or create a 2nd
mouse pointer and control that... keyboard - same, etc. etc. etc.). this is a
nasty little thing and in implementing something like this you are also forcing
compositors to work in specific ways - eg screen capture will likely FORCE the
compositor to merge it all into a single ARGB buffer for you rather than just
assign it to hw layers. or perhaps it would require just exposing all the
layers, their config and have the client "deal with it" ? but that means the
compositor needs to expose its screen layout. do you include pointer or not?
compositor may draw ptr into the framebuffer. it may use a special hw layer.
what about if the compositor defers rendering - does a screen capture api force
the compositor to render when the client wants? this can have all kinds of
nasty effects in the rendering pipeline - for us, our rendering pipeline is
not in the compositor but via the same libraries clients use so altering this
pipeline affects regular apps as well as compositor. ... can of worms :)
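
as an illustration of the "merge it all into a single ARGB buffer" cost, a toy
python sketch. the layer dict layout, the is_pointer flag and the
include_pointer switch are assumptions made up for the example, not how any
particular compositor stores its layers. pixels are premultiplied (a, r, g, b)
tuples in 0..255:

```python
def over(top, bottom):
    """Porter-Duff 'over' for premultiplied (a, r, g, b) pixels, 0..255 each."""
    inv = 255 - top[0]
    return tuple(t + (b * inv + 127) // 255 for t, b in zip(top, bottom))


def flatten(layers, width, height, include_pointer=True):
    """Merge bottom-to-top layers into one ARGB frame.

    This is the extra pass a capture request could force on a compositor
    that would otherwise leave each layer on its own hardware plane.
    """
    frame = [[(0, 0, 0, 0)] * width for _ in range(height)]
    for layer in layers:
        if layer["is_pointer"] and not include_pointer:
            continue
        for dy, row in enumerate(layer["pixels"]):
            for dx, px in enumerate(row):
                x, y = layer["x"] + dx, layer["y"] + dy
                if 0 <= x < width and 0 <= y < height:
                    frame[y][x] = over(px, frame[y][x])
    return frame


blue = (255, 0, 0, 255)
white = (255, 255, 255, 255)
layers = [
    {"x": 0, "y": 0, "is_pointer": False, "pixels": [[blue, blue], [blue, blue]]},
    {"x": 1, "y": 1, "is_pointer": True, "pixels": [[white]]},
]
with_pointer = flatten(layers, 2, 2)
without_pointer = flatten(layers, 2, 2, include_pointer=False)
print(with_pointer[1][1], without_pointer[1][1])
```

even in this toy the api has to answer the pointer question, and every pixel of
every layer gets touched once per captured frame.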

> --
> Drew DeVault

------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at
