Collaboration on standard Wayland protocol extensions

Carsten Haitzler (The Rasterman) raster at
Tue Mar 29 06:10:10 UTC 2016

On Tue, 29 Mar 2016 00:01:00 -0400 Drew DeVault <sir at> said:

> On 2016-03-29 11:31 AM, Carsten Haitzler wrote:
> > my take on it is that it's premature and not needed at this point. in fact i
> > wouldn't implement a protocol at all. *IF* i were to allow special access,
> > i'd simply require to fork the process directly from compositor and provide
> > a socketpair fd to this process and THAT fd could have extra capabilities
> > attached to the wl protocol. i would do nothing else because as a
> > compositor i cannot be sure what i am executing. i'd hand over the choice
> > of being able to execute this tool to the user to say ok to and not just
> > blindly execute anything i like.
> I don't really understand why forking from the compositor and bringing
> along the fds really gives you much of a gain in terms of security. Can


there is no way a process can access the socket with privs (even know the
extra protocol exists) unless it is executed by the compositor. the compositor
can do whatever it deems "necessary" to ensure it executes only what is
allowed. eg - a whitelist of binary paths. i see this as a lesser chance of a
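
as a rough illustration only (python for brevity, not what any compositor actually ships): the compositor checks a whitelist, creates a socketpair, and execs the helper with one end inherited. ALLOWED_HELPERS and spawn_privileged are made-up names for this sketch. WAYLAND_SOCKET is the environment variable libwayland clients already look at for a pre-opened fd.

```python
import os
import socket
import subprocess
import sys

# hypothetical whitelist of binaries the compositor is willing to run with
# extra capabilities; a real compositor would load this from its own config
ALLOWED_HELPERS = {os.path.realpath(sys.executable)}


def spawn_privileged(path, args=()):
    """Exec a whitelisted helper and hand it one end of a private socketpair.

    Returns (process, compositor_end). Only the child we exec'd ourselves
    ever sees the other end, so nothing else can reach the privileged socket.
    """
    real = os.path.realpath(path)
    if real not in ALLOWED_HELPERS:
        raise PermissionError(f"{real} is not whitelisted")

    comp_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # tell the child which inherited fd to use
    env = dict(os.environ, WAYLAND_SOCKET=str(client_end.fileno()))
    proc = subprocess.Popen(
        [real, *args],
        env=env,
        pass_fds=[client_end.fileno()],  # keep exactly this fd open in the child
    )
    client_end.close()  # the child holds its own copy now
    return proc, comp_end
```

in a real compositor the compositor end would then be registered as a client (for example with libwayland-server's wl_client_create) and the extra protocol bound only on that connection. that part is left out here.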

> you elaborate on how this changes things? I should also mention that I
> don't really see the sort of security goals Wayland has in mind as
> attainable until we start doing things like containerizing applications,
> in which case we can eliminate entire classes of problems from this
> design.

certain os's do this already - tizen does. we use smack labels. this is why i
care so much about application isolation and not having anything exposed to an
app that it doesn't absolutely need. :) so i am coming from the point of view
of "containerizing is solved - we need to not break that in wayland" :)

> > all a compositor has to do is be able to capture a video stream to a file.
> > you can ADD watermarking, sepia, and other effects later on in a video
> > editor. next you'll tell me gimp is incapable of editing image files so we
> > need programmatic access to a digital camera's ccd to implement
> > effects/watermarking etc. on photos...
> I'll remind you again that none of this supports the live streaming
> use-case.

i know - but for just capturing screencasts, adding watermarks etc. - all you
need is to store a stream - the rest can be post-processed.

> > > currently possible with ffmpeg. How about instead we make a simple
> > > wayland protocol extension that we can integrate with ffmpeg and OBS and
> > > imagemagick and so on in a single C file.
> > 
> > i'm repeating myself. there are bigger fish to fry.
> I'm repeating myself. Fry whatever fish you want and backlog this fish.
> > eh? ummm that is what happens - unless you close the lid, then internal
> > display is "disconnected".
> I'm snipping out a lot of the output configuration related stuff from
> this response. I'm not going to argue very hard for a common output
> configuration protocol. I've been trying to change gears on the output
> discussion towards a discussion around whether or not the
> fullscreen-shell protocol supports our needs and whether or how it needs
> to be updated wrt permissions. I'm going to continue to omit large parts
> of your response that I think are related to the resistance to output
> configuration, let me know if there's something important I'm dropping
> by doing so.

why do we need the fullscreen shell? from memory, that was intended for
environments where apps are only ever fullscreen. xdg shell has the ability for a
window to go fullscreen (or back to normal) and this should do just fine. :) sure -
let's talk about this stuff - fullscreening etc.

> > a protocol with undefined metadata is not a good protocol. it now carries
> > blobs of data that are opaque except to specific implementations. this
> > will mean that other implementations eventually will do things like strip
> > it out or damage it as they don't know what it is nor do they care.
> It doesn't have to be undefined metadata. It can just be extensions. A
> protocol with extensions built in is a good protocol whose designers had
> foresight, kind of like the Wayland protocol we're all already making
> extensions for.

yeah - but you are creating objects (screens) with no extended data - or
modifying them. you don't have or lose the data. :) let's talk about the actual
apps surfaces and where they go - not configuration of outputs. :)

> > but it isn't the user - it's some game you download that you cannot alter
> > the code or behaviour of that then messes everything up because its creator
> > only ever had a single monitor and didn't account for those with 2 or 3.
> But it _is_ the user. Let the user configure what they want, however
> they want, and make it so that they can both do this AND deny crappy
> games the right to do it as well. This applies to the entire discussion
> broadly, not necessarily just to the output configuration bits (which I
> retract).
> > because things like output configuration i do not see as needing a common
> > protocol. in fact it's desirable to not have one at all so it cannot be
> > abused or cause trouble.
> Troublemaking software is going to continue to make trouble. Further
> news at 9. That doesn't really justify making trouble for users as well.

or just have the compositor "work" without needing scripts and users to have to
learn how to write them. :)

> > > In practice the VAST majority of our users are going to be using one or
> > > more rectangular displays. We shouldn't cripple what they can do for the
> > > sake of the niche. We can support both - why do we have to hide
> > > information about the type of outputs in use from the clients? It
> > > doesn't make sense for an app to get fullscreened in a virtual reality
> > > compositor, yet we still support that. Rather than shoehorning every
> > > design to meet the least common denominator, we should be flexible.
> > 
> > they are not crippled. that's the point. in virtual reality fullscreen makes
> > sense as a "take over the world", not take over the output to one eye. for
> > monitors on a desktop it makes sense to take over that monitor but not
> > others. so it depends on context and the compositor's job is to
> > interpret/manage/deal with that context.
> I don't really understand what you're getting at here.

apps can still be fullscreen. nothing has been crippled. just what fullscreen
MEANS is defined by context by the compositor.

> > sorry. neither in x11 nor in wayland does a wm/compositor just have the
> > freedom to resize a window to any size it likes WITHOUT CONSEQUENCES. in
> > x11 min/max size hints tell the wm the range of sizes a window can be
> > sensibly drawn/laid out with. in wayland it's communicated by buffer size.
> > if you choose to ignore this then you get to deal with the consequences as
> > in your screenshot.
> Here's gnome-calculator running on x with a tiling window manager:

that'd be the toolkit actually resizing regardless of its min/max hints - the
wayland back end is refusing to do this. the x11 back end is "dealing with it"
even though it doesn't have to. i can point at more software that when you go
beyond max or below min size looks like trash - it may have blank/garbage areas
of the window or fall over in other ways. in x11 you CANNOT hard-control your
window size. the wm can resize it to whatever and ignore your min/max hints.
in wayland the CLIENT controls buffer size and fills buffer with content before
compositor sees it. compositor can't force a buffer size on a client. x and
wayland work differently in the case where the wm decided to just go "screw you
- i'm doing this". you may want to NOT do that and respect the fact the client
has a min and max size and work with it. :)

> Here's the wayland screenshot again for comparison:
> Most apps are fine with being told what resolution to be, and they
> _need_ to be fine with this for the sake of my sanity. But I understand
> that several applications have special concerns that would prevent this

but for THEIR sanity, they are not fine with it. :)

> from making sense, and for those it's simply a matter of saying that
> they'd prefer to be floating. This is actually one of the things in the
> X ecosystem that works perfectly fine and has worked perfectly fine for
> a long time.

no. this has nothing to do with floating. this has to do with minimum and in
this case especially - maximum sizes. it has NOTHING to do with floating. you
are conflating sizing with floating because floating is how YOU HAPPEN to want
to deal with it. you COULD deal with it as i described - pad out the area or
scale retaining aspect ratio - allow user to configure the response. if i had a
small calculator on the left and something that can size up on the right i
would EXPECT a tiling wm to be smart and do:

|   |............|
|   |............|

so keep the left column the max width of all clients and the right side expands
instead. on the left i pad with black/background around the "calculator" there.
that is what i'd expect if a client can't size up. the same for min size
(sizing down) - don't force apps to be smaller than their min size. deal with it
by scrolling or scaling the bitmap or however you like - but deal with it. :)
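
a minimal sketch of that idea, assuming a two-column layout and width-only hints (Hints, clamp_width and layout_two_columns are names invented for this example, and max_w == 0 meaning "no maximum" is just a convention of the sketch):

```python
from dataclasses import dataclass


@dataclass
class Hints:
    min_w: int
    max_w: int  # 0 means "no maximum" in this sketch


def clamp_width(want, hints):
    """Closest width to `want` that the client says it can handle."""
    w = max(want, hints.min_w)
    if hints.max_w:
        w = min(w, hints.max_w)
    return w


def layout_two_columns(screen_w, left, right):
    """Size the left column to its widest client; the right column expands.

    Returns (left_col_w, right_col_w, left_widths, left_padding, right_widths).
    left_padding is how much background to draw beside each left client.
    """
    half = screen_w // 2
    # offer each left client half the screen, but respect its min/max
    left_widths = [clamp_width(half, h) for h in left]
    left_col = max(left_widths)
    right_col = screen_w - left_col
    right_widths = [clamp_width(right_col, h) for h in right]
    left_padding = [left_col - w for w in left_widths]
    return left_col, right_col, left_widths, left_padding, right_widths
```

with a 1920 wide screen, a calculator that tops out at 400 on the left and a terminal with no maximum on the right, the left column ends up 400 wide and the right one gets the remaining 1520. the sketch does not handle a right column that ends up narrower than a client's minimum. that is where scrolling or scaling would come in.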

but don't confuse min and max size with floating. expecting devs to tell you
they want to float is not going to be common as most devs won't target a tiling
wm to make you happy here. YOU should choose to float - eg if window is of a
dialog type, or perhaps if it refuses to adapt to the size given etc. you need
to come up with properties/tags/modes/intents that are common across DEs to
have them be supported commonly. floating will not be common except as a SPECIAL
mode for tiling wm's. try something else. :)

> > i would not just blindly ignore such info. i'd either pad with
> > black/background and keep to the buffer size or at least scale while
> > retaining aspect ratio (and pad as needed but likely less).
> Eww.
> > interestingly now you complain about clients having EXPLICIT control and you
> > say "oh well no ... this is bad for tiling wm's" ... yet when i explain that
> > having output configuration control etc. etc. is harmful it's something that
> > SHOULD be allowed for clients... (and where the output isn't even a client
> > resource unlike the buffers that they render which is one).
> What I really want is _users_ to have control. I don't like it that
> compositors are forcing solutions on them that don't allow them to be
> in control of how their shit works.

they can patch their compositors if they want. if you are forcing users to
write scripts you are already forcing them to "learn to code" in a simple way.
would it not be best to try and make things work without needing scripts/custom
code per user and have features/modes/logic that "just work" ?

> > > Users should be free to choose the tools they want. dmenu is much more
> > > flexible and scriptable than anything any of the DEs offer in its place
> > 
> > that is your wm's design. that is not the design of others.
> > they want something integrated...
> okay
> >...and don't want external tools.
> Bullshit. Give them something integrated and they'll use it. However,

i was speaking of the other DE developers - not users. YOUR design does not
want integrated. others WANT integrated designs and DON'T want adhoc
non-integrated components in their desktop environment they are creating.

> there's no reason why the integrated solution and the external tools
> couldn't both exist. The users don't give a fuck about whether or not
> the external tools exist. They are apathetic about it, they don't
> actively "not want it", and their experience is in no way worsened by
> the availability of external tools. Those who do want external tools,
> however, have a worsened experience if we design ourselves into a black
> box that no one can extend.

you need to calm down i think.

*I* do not want adhoc panels/taskbars/tools written by separate projects within
my DE because they cause more problems than they solve. been there. done that.
not going back. i learned my lesson on that years ago. for them to be fully
functional you have to have pagers and taskbars in them, and unless you ALSO then
bind all this metadata for the pagers, virtual desktops and their content to a
protocol that is also universal, it's rather pointless. this then ties your
desktop to a specific design of how desktops are (eg NxM grids and only ONE of
those in an entire environment), whereas with enlightenment each screen has an
independent NxM grid that can be switched separately.

so either i break all those 3rd party pagers or i compromise design and force
everyone into a horrible "1 desktop spans all screens and u have NxM virtual
desktops for all screens combined" which is far worse, so i abandoned
supporting the protocol (netwm).
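
to make the mismatch concrete, a small sketch (Screen and as_single_grid are invented names; the "single grid" side is modelled on netwm having one desktop layout and one current-desktop index for the whole session, and this is only an approximation of that):

```python
from dataclasses import dataclass


@dataclass
class Screen:
    cols: int
    rows: int
    current: int = 0  # index of the desktop currently shown on THIS screen


def as_single_grid(screens):
    """Try to describe per-screen state as one global (cols, rows, current).

    Returns None when the screens disagree, i.e. when a single global grid
    with a single current desktop cannot represent what the user sees.
    """
    dims = {(s.cols, s.rows) for s in screens}
    currents = {s.current for s in screens}
    if len(dims) != 1 or len(currents) != 1:
        return None
    (cols, rows), = dims
    return cols, rows, currents.pop()
```

as soon as one screen is switched independently of the other, as_single_grid returns None: there is no honest answer to give an external pager that only understands the global model.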

for good historical reasons i know *I* don't want to repeat this design from
x11 with wayland. just to implement a pager or taskbar is a security hole as
you begin to expose other clients - no more isolation. you expose buffers of
their content. AND you limit your notions of a desktop/screen to those defined
by that protocol. i would not start walking down this path to begin with.

i'm warning you that you are simply repeating past mistakes by trying to go
this way.

> > > - you just pipe in a list of things and the user picks one. Don't be
> > > fooled into thinking that whatever your DE does for a given feature is
> > > the mecca of that feature. Like you were saying to make other points -
> > 
> > no - but i'm saying that this is not a COMMON feature among all DEs.
> > different ones will work differently. gnome 3's chosen design these days is
> > to put it into gnome shell via js extensions, not the gnome 2 way with a
> > separate panel process (ala dmenu). enlightenment does it internally too
> > and extends differently. my point is that what you want here is not
> > universal.
> I'm not suggesting anything radical to try and cover all of these use
> cases at once. Sway has a protocol that lets a surface indicate it wants
> to be docked somewhere, which allows for custom taskbars and things like
> dmenu and so on to exist pretty easily, and this protocol is how swaybar
> happens to be implemented. This doesn't seem very radical to me, it
> doesn't enforce anything on how each of the DEs choose to implement
> their this and that.

then keep your protocol. :) i know i have no interest in supporting it - as
above. :)

> > > there are fewer contributors to each DE than you might imagine. DEs are
> > 
> > that is exactly what i said in response to you saying that "we have all the
> > resources to do all of this" when i said we don't... :/ we don't - resources
> > are already expended elsewhere.
> We've both used this same argument from each side multiple times, it's
> getting kind of old. But I think these statements hold true:
> There aren't necessarily enough people to work on the features I'm
> proposing right now. I don't think anyone needs to implement this _right
> now_. There also aren't ever enough people to give every little feature
> of their DE the attention that leads to software that is as high quality
> as a similar project with a single focus on that one feature.

that is true. :)

> > > Be flexible enough for users to pick the tools they want.
> > 
> > a lifetime of doing wm's has taught me that this approach is not the best.
> > you end up with a limiting and complex protocol to then allow taskbars,
> > pagers and so on to be in "dmenus" of this world. this is how gnome 1.x and
> > 2.x worked. i added the support in e long ago. i learned that it was a
> > limiter in adding features as you had to conform to someone else's idea of
> > what virtual desktops are etc.
> A lifetime of using and customizing and scripting WMs that are more
> composable and configurable than e, gnome, kde, and most of the other
> Big Ones has led me to the opposite conclusion. I'm not suggesting we do
> these sorts of efforts ad nauseam. I don't think we're heading towards a
> situation where we're agreeing on the implementation of virtual
> desktops. I'm putting forth a small handful of important, core features
> that we are all going to have to support in some way or another to even
> qualify as wayland compositors and subvert X's dominance over the
> desktop.

i just think that some of the things you want should stay "within your
compositor and its extension protocols". other things i see as genuinely
globally useful. :)

> > these panels/taskbars/shelves/whatever are best being closely integrated
> > into the wm.
> You don't provide any justification for this, you just say it like it's
> gospel, and it's not. I will again remind you that not everyone wants to

considering i actually have implemented all of this over the years,
experienced the downsides and have come around to the conclusion that an
integrated environment works best ... i've done the miles. i explained above
how issues with pagers (external ones) create issues in x11 and thus they were
dropped. not to mention security concerns (that were not an issue in x11
because it's insecure by design - insecure meaning you can access any content
of any window at any time, or discover all your application window id's any
time in the window tree whenever you want - no isolation ... etc.).

> buy into a desktop environment wholesale. They may want to piece it
> together however they see fit and it's their god damn right to. Anything
> else is against the spirit of free software.

i disagree. i can't take linux and just use some bsd device driver with it - oh
dear. that's against the spirit of free software! i have to port it and
integrate it (as a kernel module). wayland is about making the things that HAVE
to be shared protocol just that. the things that don't absolutely have to be,
we don't. you are able to patch, modify and extend your de/wm, all you like -
most de's provide some way to do this. gnome today uses js. e uses loadable
modules. i am unsure about kde. :)

> > > These features have to get done at some point. Backlog your
> > > implementation of these protocols if you can't work on it now.
> > 
> > that's what i'm saying. :)
> In this case, I'm not seeing how your points about what order things
> need to be done in matters. Now is the right time for me to implement
> this in Sway. The major problems you're trying to solve are either
> non-issues or solved issues on Sway, and it makes sense to do this now.
> I'd like to do it in a way that works for everyone.

you need to solve clients that have a min/max size without introducing the
need for a floating property. that is something entirely different. not solved.
what happens when you need to restart sway after some development? where do all
your terminals/editors/ide's, browsers/irc clients go? they vanish and you have
to re-run them?

> > > You misunderstand me. I'm not suggesting that these apps be crippled.
> > > I'm suggesting that, during the negotiation, they _object_ to having the
> > > server draw their decorations. Then other apps that don't care can say
> > > so.
> > 
> > aaah ok. so compositor adapts. then likely i would express this as a
> > "minimize your decorations" protocol from compositor to client, client to
> > compositor then responds similarly like "minimize your decorations" and
> > compositor MAY choose to not draw a shadow/titlebar etc. (or client
> > responds with "ok" and then compositor can draw all it likes around the
> > app).
> I think Jonas is on the right track here. This sort of information could
> go into xdg_*. It might not need an entire protocol to itself.

i'd lean on a revision of xdg :)

> > > I don't want to rehash this old argument here. There's two sides to this
> > > coin. I think everyone fully understands the other position. It's not
> > > hard to reach a compromise on this.
> > 
> > it's sad that we have to have this disagreement at all. :) go on. join the
> > dark side! :) we have cookies!
> Never! I want my GTK apps and my Qt apps to have the same decorations,
> dammit :) Too bad I don't have much hope for making my cursor theme
> consistent across my entire desktop...


> > > What, do you expect to tell libavcodec to switch pixel formats
> > > mid-recording? No one is recording their screen all the time. Yeah, you
> > > might hit performance issues. So be it. It may not be ideal but it'll
> > > likely be well within the limits of reason.
> > 
> > you'll appreciate what i'm getting at next time you have to do 4k ... or 8k
> > video and screencast/capture that. :) and have to do miracast... on a 1.3ghz
> > arm device :)
> I'll go back to the earlier argument of "we shouldn't cripple the
> majority for the sake of the niche". Who on Earth is going to drive an
> 8K display on a 1.3ghz ARM device anyway :P

... you might be surprised. 4k ones are already out there. ok. not 1.3ghz -
2ghz - but no way you can capture even 4k with the highest end arms unless you
avoid conversion. you keep things in yuv space and drop your bandwidth
requirements hugely. in fact you never leave yuv space and make use of the hw
layers and the video decoder decodes directly into scanout buffers. you MAY be
able to stuff the yuv buffers back into an encoder and re-encode again ... just.
but it'd be better not to decode AND encode by taking the mp4/whatever stream
directly and shuffle it down the network pipe. :)
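
some back-of-envelope numbers for the raw (uncompressed) case, assuming 3840x2160 at 60fps, 32 bits per pixel for rgb (xrgb8888) and 12 bits per pixel for 4:2:0 yuv (nv12 style). the frame rate and pixel formats are assumptions picked for illustration, and stride/padding is ignored:

```python
def bytes_per_second(width, height, bits_per_pixel, fps):
    """Raw bandwidth needed to move every frame once, ignoring stride/padding."""
    return width * height * bits_per_pixel // 8 * fps


W, H, FPS = 3840, 2160, 60

rgb = bytes_per_second(W, H, 32, FPS)  # 32bpp rgb
yuv = bytes_per_second(W, H, 12, FPS)  # 12bpp yuv 4:2:0

print(f"rgb: {rgb / 1e9:.2f} GB/s")
print(f"yuv: {yuv / 1e9:.2f} GB/s")
print(f"ratio: {rgb / yuv:.2f}x")
```

that is roughly 1.99 GB/s versus 0.75 GB/s for a single pass over the frames, and a capture path that converts touches the data more than once (read, convert, write), so the gap only grows.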

believe it or not TODAY tablets with 4k screens ship. you can buy them. they
are required to support things like miracast (mp4/h264 stream over wifi). it's
reality today. products shipping in the 100,000's and millions. :)

------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at
