Collaboration on standard Wayland protocol extensions

Drew DeVault sir at
Mon Mar 28 14:55:05 UTC 2016

On 2016-03-28 11:03 PM, Carsten Haitzler wrote:
> should we? is it right to create yet another security model in userspace
> "quickly" just to solve things that dont NEED solving at least at this point.

I don't think that the protocol proposed in other branches of this
thread is complex or short sighted. Can you hop on that branch and
provide feedback?

> adding watermarks can be done after encoding as another pass (encode in high
> quality). hell watermarks can just be a WINDOW (surface) on the screen. you
> don't need options. as for audio - not too hard to do along with it. just
> offer to record an input device - and choose (input can be current mixed output
> or a mic ... or both).

You're still not grasping the scope of this. I want you to run this
command right now:

man ffmpeg-all

Just read it for a while. You're delusional if you think you can
feasibly implement all of these features in the compositor. Do you
honestly want your screen capture tool to be able to add a watermark?
How about live streaming, some people add a sort of extra UI to read off
donations and such. The scope of your screen capture tool is increasing
at an alarming rate if you intend to support all of the features
currently possible with ffmpeg. How about instead we make a simple
wayland protocol extension that we can integrate with ffmpeg and OBS and
imagemagick and so on in a single C file.
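
As a rough sketch of how thin the client side of such an extension could be, assume a hypothetical capture protocol that hands the client raw BGRA frames as bytes. The client then only has to forward those bytes to ffmpeg's stdin. The function names and the frame source here are illustrative assumptions; the options passed are those of ffmpeg's rawvideo demuxer.

```python
import subprocess


def ffmpeg_command(width, height, fps, output_path):
    """Build an ffmpeg command line that reads raw BGRA frames from stdin."""
    return [
        "ffmpeg",
        "-f", "rawvideo",              # input is headerless raw video
        "-pixel_format", "bgra",       # assumed layout of the captured frames
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "-",                     # read the frames from stdin
        output_path,
    ]


def record(frames, width, height, fps, output_path):
    """Pipe an iterable of raw frames (bytes objects) into ffmpeg.

    `frames` stands in for whatever the hypothetical capture
    protocol would deliver to the client.
    """
    proc = subprocess.Popen(
        ffmpeg_command(width, height, fps, output_path),
        stdin=subprocess.PIPE,
    )
    for frame in frames:
        proc.stdin.write(frame)
    proc.stdin.close()
    return proc.wait()
```

Watermarks, overlays, streaming targets and the like then stay on the ffmpeg side of the pipe rather than in the compositor.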

> exactly what you describe is how e works out of the box. no scripts needed.
> requiring people write script to do their screen configuration is just wrong.
> taking the position of "well i give up and won't bother and will just make my
> users write scripts instead" is sticking your head in the sand and not solving
> the problem. you are now asking everyone ELSE who writes a compositor to
> implement a protocol because YOU wont solve a problem that others have solved
> in a user friendly manner.

What if I want my laptop display to remain usable? Right now I'm docked
somewhere else and I actually do have this scenario - my laptop is one
of my working displays. How would I configure the difference between
these situations in your tool? What if I'm on a laptop with poorly
supported hardware (I've seen this before) where there's a limit on how
many outputs I can use at once? What if I want to write a script where I
put on a movie and it disables every output but my TV automatically? The
user is losing a lot of power here and there's no way you can satisfy
everyone's needs unless you make it programmable.
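
The movie script case might look something like the following sketch. The output names and the enabled/disabled map are hypothetical stand-ins for whatever an output configuration protocol would actually expose; the point is only that the policy is a few lines once it is scriptable.

```python
def movie_mode(outputs, keep):
    """Return a new enabled/disabled map with only `keep` left on.

    `outputs` maps an output name to whether it is currently enabled.
    """
    if keep not in outputs:
        raise ValueError(f"unknown output: {keep}")
    return {name: (name == keep) for name in outputs}


# Hypothetical output names for a docked laptop with a TV attached.
current = {"LVDS-1": True, "HDMI-A-1": True, "DP-1": False}
wanted = movie_mode(current, "HDMI-A-1")
```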

> > Base your desktop's tools on the common protocol, of course. Gnome
> > settings, KDE settings, arandr, xrandr, nvidia-settings, and so on, all
> > seem to work fine configuring your outputs with the same protocol today.
> > Yes, the protocol is meh and the implementation is a mess, but the
> > clients of that protocol aren't bad by any stretch of the imagination.
> no tools. why do it? it's built in. in order for screen config "magic" to
> work, a set of metadata is attached to screens. you can set priority (screens get
> numbers from highest to lowest priority at any given time allowing behaviour
> like your "primary" screen to migrate to an external one then migrate back when
> external monitor is attached etc.) sure we can start having that metadata
> separate but then ALTERNATE TOOLS won't be able to configure it thus breaking
> the desktop environment not providing metadata and other settings associated
> with a display. this breaks functionality for users who then complain about
> things not working right AND then the compositor has to now deal with these
> "error cases" too because a foreign tool will be messing with its data/setup.

Your example has a pretty straightforward baseline - the "default"
profile. Even so, we can design the protocol to make the custom metadata
options visible to the tools, and the tools can then provide the user
with options to configure that as well.

> as above. i have seen screen configuration used and abused over the years where
> i just do not want to have a protocol for messing around with it for any
> client. give them an inch and they'll take a mile.

Let them take a mile. _I_ want a mile. Here's an old quote that I think
is always relevant:

UNIX was not designed to stop its users from doing stupid things, as
that would also stop them from doing clever things.

> and that's perfectly fine - that is your choice. do not force your choice on
> other compositors. you can implement all the protocol you want in any way you
> want for your wm's tools.

Why do we have to be disjointed? We have a common set of problems and we
should strive for a common set of solutions.

> gnome does almost everything with dbus. they love dbus. a lot of gnome is
> centred around dbus. they likely will choose dbus to do this. likely. i
> personally wouldn't choose to use dbus.

Let's not speak for Gnome. They're copied on this thread, they'll speak
for themselves.

> > primary display? What about applications that use the entire output for
> the app can simply not request to present on their "presentation" screen... or
> the user would mark their primary screen (internal on laptop maybe) AS their
> presentation screen - more metadata to be held by compositor.

Then we're back to the very thing you were criticising before - making
the applications implement some sort of switch between using a
"presentation" output and using some other kind of output. It would be a
lot less complicated if the application asked to go full screen and the
compositor said "hey, this app wants to be full screen, which output
would you like to put it on?"

> now ALL presentation tools behave the same -  you dont have to reconfigure each
> one separately and deal with the difference and lack or otherwise of features.
> it's done in 1 place - compositor, and then all apps that want to do a
> similar thing follow and work "as expected". far better than just ignoring the
> issue. you yourself already talked about extra tags/hints/whatever - this is
> one of those.

I think I'm getting at something here. Does the workflow I just
described satisfy everyone's needs for this?

> because this require clients DEFINING screen layout. wayland was specifically
> designed to HIDE THIS. if the compositor displayed a screen wrapped around a
> sphere in real life in a room - then it doesn't have rectangles... how will an
> app deal with that? what if the compositor is literally a VR world with
> surfaces wrapped around spheres and cubes - the point of wayland's design was
> to hide this info from clients completely so the compositor decides based on
> environment, not each and every client. this was a basic premise/design in
> wayland from the get go and it was a good one. letting apps break this
> abstraction breaks this design.

In practice the VAST majority of our users are going to be using one or
more rectangular displays. We shouldn't cripple what they can do for the
sake of the niche. We can support both - why do we have to hide
information about the type of outputs in use from the clients? It
doesn't make sense for an app to get fullscreened in a virtual reality
compositor, yet we still support that. Rather than shoehorning every
design to meet the least common denominator, we should be flexible.

> > No. Applications want to be full screen or they don't want to be. If
> > they want to pick a particular output, we can easily let them do so.
> i don't know about you.. but fullscreen to enlightenment means you use up ONE
> SCREEN. [snip]

I never said that fullscreen means multiple screens. No clue where
that's coming from.

> what makes sense is an app hints at the purpose of its window and opens n
> windows (surfaces). it can ask for fullscreen for each. the hints would allow
> the compositor to choose which screen the window/surface is assigned to.

Hinting doesn't and cannot capture all of the use cases. Just letting
the client say what it wants does.

> > Gnome calculator doesn't like being tiled:
> i think the problem is you are not handling min/max sizing of clients
> properly. :) you need to fix sway. gnome calculator is not sizing up its buffer
> on surface size. that is a message "i can't be bigger than this - this is my
> biggest size. deal with is". you need to deal with it. eg - pad it and make it
> sized AT the buffer size :)

This is harmful to tiling window managers in general. The window manager
arranges the windows, not the other way around. You can't have tiling
window management if you can't have the compositor tell the clients what
size to be. There's currently no metadata to tell the compositor that a
surface is strict about its geometry. Most applications handle being
given a size quite well and will rearrange/rerender themselves to
compensate. Things like gnome-calculator are the exception, not the
rule.
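
To make the two positions concrete, here is a small sketch with hypothetical names and a single-row layout. The tiler assigns every window an equal cell; a client that refuses to grow past a maximum size can at best be padded and centered inside its cell, which is the workaround suggested in the quoted text.

```python
def tile_row(screen_width, screen_height, count):
    """Split the screen into `count` equal-width cells as (x, y, w, h)."""
    cell_width = screen_width // count
    return [(i * cell_width, 0, cell_width, screen_height) for i in range(count)]


def pad_into_cell(cell, max_size):
    """Center a client that will not exceed max_size=(w, h) inside its cell."""
    x, y, w, h = cell
    client_w = min(w, max_size[0])
    client_h = min(h, max_size[1])
    return (x + (w - client_w) // 2, y + (h - client_h) // 2, client_w, client_h)


cells = tile_row(1920, 1080, 2)
# A client capped at 400x600 ends up floating in the middle of its tile.
calculator = pad_into_cell(cells[1], (400, 600))
```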

> > > xdg shell should be handling these already - except dmenu. dmenu is almost a
> > > special desktop component. like a shelf/panel/bar thing.
> > 
> > dmenu isn't the only one, though, that may want to arrange itself in
> > special ways. Lemonbar and rofi also come to mind.
> all of these basically are "desktop components" ala
> taskbars/shelves/panels/whatever - i know that for e we don't want to support
> such apps. these are built in. i don't know what gnome or kde think but these
> go against their design as an integrated desktop environment. YOU need these
> because your compositor has no such feature itself. the bigger desktops don't
> need it. they MAY support it - may not. i know i don't want to. :)

Users should be free to choose the tools they want. dmenu is much more
flexible and scriptable than anything any of the DEs offer in its place
- you just pipe in a list of things and the user picks one. Don't be
fooled into thinking that whatever your DE does for a given feature is
the mecca of that feature. Like you were saying to make other points -
there are fewer contributors to each DE than you might imagine. DEs are
spread too thin to make the perfect _everything_. But some projects like
dmenu are small and singular in their focus, and maintained by one or
two people who put in a much larger amount of effort than is put in by
DE contributors on the corresponding features of that DE.

Be flexible enough for users to pick the tools they want.
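
The "pipe in a list, get one back" model is small enough to sketch in full. This assumes a dmenu-style program that reads options on stdin, one per line, prints the selection on stdout, and exits non-zero when the user cancels. The `pick` helper and the stand-in command are illustrative only.

```python
import subprocess
import sys


def pick(options, menu_cmd=("dmenu",)):
    """Pipe options to a dmenu-style program and return the choice, or None."""
    result = subprocess.run(
        list(menu_cmd),
        input="\n".join(options) + "\n",
        capture_output=True,
        text=True,
    )
    choice = result.stdout.strip()
    return choice if result.returncode == 0 and choice else None


# Stand-in "menu" that always picks the first line, so the sketch
# can be exercised on a machine without dmenu installed.
FIRST_LINE = (sys.executable, "-c", "import sys; print(sys.stdin.readline().strip())")
```

Calling `pick(["firefox", "krita"])` on a system with dmenu would show the menu; passing `menu_cmd=FIRST_LINE` exercises the same plumbing non-interactively.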

> i don't know osu - but i see no reason krita needs to configure a tablet. it
> can just deal with input from it. :)
> input is very sensitive. having done this for years and watched how games like
> to turn off key repeat then leave it off when they crash... or change mouse
> accel then you find its changed everywhere and have to "fix it" etc. etc. - i'd
> be loathe to do this. give them TOO much config ability and it can become a
> security issue.

Let's change the tone of the input configuration discussion. I've come
around to your points about providing input configuration in general to
clients; let's not do that. I think the only issue we should worry about
for input at this point is fixing the pointer-constraints protocol to
use our new permissions model.

> > Why do those things need to be dealt with first? Sway is at a good spot
> > where I can start thinking about these sorts of things. There are
> > enough people involved to work on multiple things at once. Plus,
> > everyone thinks nvidia's design is bad and we're hopefully going to see
> > something from them that avoids vendor-specific code.
> because these imho are far more important. you might be surprised at how few
> people are involved.

These features have to get done at some point. Backlog your
implementation of these protocols if you can't work on it now.

> not so simple. with more of the ui of an app being moved INTO the border
> (titlebar etc.) this is not a simple thing to just turn it off. you then turn
> OFF necessary parts of the ui or have to push the problem out to the app to
> "fallback".

You misunderstand me. I'm not suggesting that these apps be crippled.
I'm suggesting that, during the negotiation, they _object_ to having the
server draw their decorations. Then other apps that don't care can say

> only having CSD solves all that complexity and is more efficient
> than SSD when it comes to things like assigning hw layers or avoiding copies of
> vast amounts of pixels. i was against CSD to start with too but i see their
> major benefits.

I don't want to rehash this old argument here. There are two sides to this
coin. I think everyone fully understands the other position. It's not
hard to reach a compromise on this.

> > In Wayland you create a surface, then assign it a role. Extra details
> > can go in between, or go in the call that gives it a role. Right now
> > most applications are creating their surface and then making it a shell
> > surface. The compositor can negotiate based on its own internal state
> > over whether a given output is tiled or not, or in cases like AwesomeWM,
> > whether a given workspace is tiled or not. And I don't think the
> > decision has to be final. If the window is moved to another output or
> > really if any of the circumstances change, they can renegotiate and the
> > surface can start drawing its own decorations.
> yup. but this signalling/negotiation has to exist. currently it doesnt. :)

We'll make this part of the protocols we're working on here :)
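
No such negotiation exists yet, so the following is only a sketch of the decision logic described above, with hypothetical names throughout: a client either objects to server-drawn decorations or has no preference, and the compositor decides from its own state (here, whether the target output is tiled). Calling it again after a move models the renegotiation.

```python
from enum import Enum


class ClientPref(Enum):
    OBJECTS = "objects"            # client insists on drawing its own decorations
    NO_PREFERENCE = "no-preference"  # client is fine with either side drawing


def negotiate(client_pref, output_is_tiled):
    """Return who draws the decorations: 'client' or 'server'."""
    if client_pref is ClientPref.OBJECTS:
        return "client"
    return "server" if output_is_tiled else "client"


# A surface with no preference starts on a tiled output, then moves
# to a floating one; the second call is the renegotiation.
before_move = negotiate(ClientPref.NO_PREFERENCE, output_is_tiled=True)
after_move = negotiate(ClientPref.NO_PREFERENCE, output_is_tiled=False)
```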

> you aren't going to talk me into implementing something that is important for
> you and not a priority for e until such a time as i'm satisfied that the other
> issues are solved. you are free to do what you want, but standardizing things
> takes a looong time and a lot of experimentation, discussion, and repeating
> this. we have resources on wayland and nothing you described is a priority for
> them. there are far more important things to do that are actual business
> requirements and so the people working need to prioritize what is such a
> requirement as opposed to what is not. resources are not infinite and free.

Like I said before, put it on your backlog. I'm doing it now, and I want
your input on it. Provide feedback now and implement later if you need
to, but if you don't then the protocols won't meet your needs.

> let me complicate it for you. let's say i'm playing a video fullscreen. you now
> have to convert argb to yuv then encode when it would have been far more
> efficient to get access directly to the yuv buffer before it was even scaled to
> screen size... :) so you have just specified a protocol that is by design
> inefficient when it could be more efficient.

What, do you expect to tell libavcodec to switch pixel formats
mid-recording? No one is recording their screen all the time. Yeah, you
might hit performance issues. So be it. It may not be ideal but it'll
likely be well within the limits of reason.
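
For a sense of what the conversion under discussion involves, here is the per-pixel arithmetic for full-range BT.601 (the JFIF variant). That particular matrix is an assumption picked for illustration and is not something specified in this thread; a real capture path would convert whole frames in bulk rather than pixel by pixel.

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YUV (full-range BT.601 / JFIF)."""

    def clamp(value):
        # Round to nearest and keep the result inside 0..255.
        return max(0, min(255, int(value + 0.5)))

    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return clamp(y), clamp(u), clamp(v)
```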

> yes - but when, how often and via what mechanisms pixels get there is a very
> delicate thing.

And yet you still need to convert the entire screen to a frame and feed
it into an encoder, no matter what. Feed the frame to a client instead.

> so far we don't exactly have a lot of inter-desktop co-operation happening.
> it's pretty much everyone for themselves except for a smallish core protocol.

Which is ridiculous.

> do NOT try and solve security sensitive AND performance sensitive AND design
> limiting/dictating things first and definitely don't do it without everyone on
> the same page.

I'm here to get everyone on the same page. Get on it.

Drew DeVault
