[PATCH weston] xdg-shell: Make stable

Pekka Paalanen ppaalanen at gmail.com
Mon Aug 25 04:11:56 PDT 2014


On Fri, 22 Aug 2014 16:48:13 -0400
"Jasper St. Pierre" <jstpierre at mecheye.net> wrote:

> On Wed, Aug 6, 2014 at 9:39 AM, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> 
> > On Thu, 17 Jul 2014 17:57:45 -0400
> > "Jasper St. Pierre" <jstpierre at mecheye.net> wrote:
> >
> > Hi Jasper,
> >
> > I am not reviewing this patch now. Instead, I took xdg-shell.xml from
> > Weston master as today, and I'm giving comments on it, as you have
> > declared it practically ready for stabilization.
> >
> > In irc, I remember you mentioned a few more things you'd like to solve,
> > like size hints, but I won't specifically comment about those, unless
> > it comes up through what's below.
> >
> > I am not too familiar with how desktops work, what the state of the
> > art on X11 is, or even all of the concepts, so please allow me some
> > room for inaccuracy.
> >
> > I am reading this protocol specification only, and I am not looking at
> > the implementation bits at all. I believe the protocol spec should be
> > enough for specifying everything, even though we don't have a good track
> > record with that.
> >
> > Almost all questions and comments I give should probably lead to some
> > changes or additions in the documentation answering those questions.
> >
> 
> Hey Pekka,
> 
> I'm going to answer inline here. Anything I've snipped out I agree with and
> have already fixed. I've started on a rewrite of the documentation in here.

Hi Jasper,

excellent. :-)

> > This is a single global, yes? What happens if a client binds to this
> > global multiple times, therefore creating multiple xdg_shell objects?
> > The ping/pong spec needs to define what happens there.
> >
> 
>  Ping/pongs are on xdg_shell objects, not on surfaces. Whether a
> client counts as "active" when all of its xdg_shell objects respond
> to pings, or when just one of them does, is not defined here. See
> below.
> 
> 
> > Does the compositor have to track, which xdg_* object has been created
> > from which xdg_shell object? For ping/pong purposes?
> >
> 
> Compositors might save all the xdg_shell objects they've created, and every
> minute ping every one. "Pinging a surface" may involve pinging the
> xdg_shell object that created the surface, or pinging all xdg_shell objects
> for the client. Weston implements the former, mutter currently implements
> the latter. I may eventually change mutter to do the former as well.

Is this choice of style of implementation something we should allow,
rather than specify one or the other in the spec?

If it is not in the spec, we will be seeing all kinds of behaviours in
the wild. It is probably not a real problem, because clients are
expected to reply to pings on all xdg_shell objects they create,
regardless of how the compositor chooses to ping. The only difference
is which surfaces could be marked as "unresponsive" when something
does not work.

But I agree with you, it shouldn't matter. I just expect these
questions to arise again when more people implement this interface,
and they might be annoyed that it's not said what should happen.

> > >       Only one shell or popup surface can be associated with a given
> > >       surface.
> > >       </description>
> > >       <arg name="id" type="new_id" interface="xdg_surface"/>
> > >       <arg name="surface" type="object" interface="wl_surface"/>
> >
> > Need to define an error case for 'surface' already having a role.
> >
> 
> This has been solved in a generic way by patches on the list.

Assuming we will land the error-in-wl_surface patch. I'm not totally
convinced yet.

> > Where does a client get the serial it needs to pass here?
> 
> > How does the compositor process the serial?
> >
> 
> See the xdg_popup documentation for the explanation of these parameters. I
> stripped out all the local text in the "get_xdg_*" method and just said
> "see xdg_* documentation", so we can put this in a central place and point
> people in the right location.

Okay.

> > >       <arg name="flags" type="uint"/>
> >
> > What are these flags? Are there no flags defined yet? Would passing an
> > unknown flag cause a protocol error?
> >
> 
> Let's just remove the flags. This stemmed from wl_shell_surface, where
> somebody had a flag for transients, but none for popups. I don't expect us
> to have flags any time soon, either.

Okay. If we need any later, it would be a new request on xdg_popup, if
not an alternate request on xdg_shell.

> > This talks about unresponsive, the ping talks about alive.
> >
> 
> Are the two different?

Just inconsistent terminology that I happened to notice.

> > >       It provides requests to treat surfaces like windows, allowing to
> > set
> > >       properties like maximized, fullscreen, minimized, and to move and
> > resize
> > >       them, and associate metadata like title and app id.
> > >
> > >       On the server side the object is automatically destroyed when
> > >       the related wl_surface is destroyed.  On client side,
> > >       xdg_surface.destroy() must be called before destroying
> > >       the wl_surface object.
> >
> > This seems to conflict the description of 'destroy' request below.
> > Instead of this, we could explain what happens if the wl_surface is
> > destroyed before its xdg_surface.
> >
> 
> Protocol error. I've made this clearer in both places now.

Very good.

> > What about surface/window mapping? To get a window mapped, is it enough
> > to just create an xdg_surface and commit content to the wl_surface? Does
> > the order of these operations matter?
> >
> 
> The way I've always viewed a surface, it's a bag of bits that has some
> existing state, like the current buffer attached, the input/opaque regions,
> and some others. Creating a role just does something with that state like
> putting it on the screen. Committing a surface shouldn't depend on the role
> of the surface, unless it's in another mode (like synchronized subsurfaces).

Ok. I was trying to ask: if a wl_surface already has committed content,
does the xdg_shell.get_xdg_surface request alone and immediately cause
the surface to be mapped? That is, whichever comes last, the
get_xdg_surface or the commit of non-NULL content, triggers the map.

> This is obviously not how reality works (cursor surfaces seem to reset
> their absolute attachment offset when calling set_pointer), but I'd like to
> consider anything else as a weird special case, and we should strive to not
> make more of those. You or Kristian can veto me here if you disagree :)

Well, the x,y of wl_surface.attach are said to define in which
directions the surface's size changes. That assumes the surface has one
or more positions. A surface that is not mapped conceptually does not
have any position. It is unclear whether we should be maintaining an
additional coordinate system, where 0,0 might not be the upper-left
corner of the surface, but would always be e.g. the cursor hotspot
or drag icon "hotspot".

However, I think documenting how that would work becomes complicated,
and it adds one more coordinate system to the protocol as visible to
the clients, and we have more than enough already.

I don't see the x,y in attach as persistent state themselves, but as
modifiers to existing state (position). wl_surface is said to have a
position in the protocol spec, and x,y modify that.

Also, when you are adding just one new thing, calling the one old thing
a "special case" is not yet justified - maybe your case is actually
the special case in the future. ;-)

> So no, it doesn't matter when commit is called in comparison to giving it a
> role. Mapping the window means that it has a window role (xdg_surface, for
> example) and that it has a non-NULL buffer attached.
> 
> That said, mutter implements the semantics right now that if you create an
> xdg_surface, and then at some point commit the surface with no content
> currently attached to the surface, it's a protocol error. I don't like the
> idea of attaching a NULL buffer to unmap a window surface, but it's weird
> that the behavior of commit changes based on the existence of a role object.

Yeah, it is weird, maybe better to not do that.

Btw. does xdg_shell have another way to hide the window but still keep
all its state, so the app can re-show it in the same place where it was
hidden?

> > If the order matters, should xdg_shell.get_xdg_surface raise a protocol
> > error if the wl_surface already has content (i.e. a non-NULL wl_buffer
> > committed)?
> >
> 
> No.
> 
> 
> > What about the initial size of the window? Does this interface
> > automatically deliver any events the moment the protocol object has
> > been created?
> >
> 
> Thanks for reminding me about this. This is something we need to stick in
> here at the last minute: an initial configure to the client after
> get_xdg_surface should be mandated.

Let's also think about the size hints, how do they interact here, even
if we are not adding them yet.

The client creates a wl_surface, gives it content, turns it into an
xdg_surface, and sends size hints.

The compositor as a response to get_xdg_surface sends configure, and
then again as response to the size hints sends a new configure.

Would that be right? Can the client count the configure events and know
that the second configure will have its size hints taken into account?
Or is there a race, where more or fewer than two configures arrive
before the size hints are processed?

Should the client use the old trick of issuing one wl_display_sync
after sending the size hints, and when that sync returns, it knows it
has seen the correct configure event which has size hints acknowledged?

Or would we have to require intermediate roundtrips? I hope not.

I think for generality, we could advise clients to do the old
wl_display_sync trick. In the future, we might have more xdg_surface
requests that may cause the compositor to change its mind on how to
configure, and just counting configure events will become unreliable
fast, especially after the surface has been mapped.

There is probably no need to write the trick into the spec now, but
when we add the size hints and whatnot, we probably should recommend
that trick.

> > >     <request name="destroy" type="destructor">
> > >       <description summary="remove xdg_surface interface">
> > >       The xdg_surface interface is removed from the wl_surface object
> > >       that was turned into a xdg_surface with
> > >       xdg_shell.get_xdg_surface request. The xdg_surface properties,
> > >       like maximized and fullscreen, are lost. The wl_surface loses
> > >       its role as a xdg_surface. The wl_surface is unmapped.
> > >       </description>
> >
> > Very good.
> >
> > What about, say, xdg_popup surfaces associated to this one?
> >
> 
> Yeah, we should unmap those too.

Ok.

> > I think 'set_parent' already describes what happens with xdg_surfaces
> > who had 'set_parent' called with this surface as argument.
> >
> 
> This is actually something I changed to match mutter's semantics: it's
> illegal to unmap a parent window before unmapping a child window.

Then is there a way to unmap them all in one go? Or is one never
necessary, because the surfaces are separate enough that it won't look
bad if one goes out slightly before the other?

Yeah, in practice it is only a theoretical race, but we do try to solve
such races when they matter. You could use a Wayland middle-man (tracer)
to make it easily visible, by delaying each protocol message from the
previous one by at least one refresh cycle. We try to make the
compositor's output perfect on every frame even in that case, because if
there is a glitch in the artificial case, there is a race also in
production use.

> > >     </request>
> > >
> > >     <request name="set_parent">
> > >       <description summary="surface is a child of another surface">
> > >       Child surfaces are stacked above their parents, and will be
> > >       unmapped if the parent is unmapped too. They should not appear
> > >       on task bars and alt+tab.
> >
> > Any other behavioral changes compared to an xdg_surface without this
> > request called?
> >
> 
> The "taskbar / alt-tab" thing is actually wrong, I think, and stemmed from
> transient windows. mutter might hide certain kinds of "child windows" (ugh,
> I hate this term in the context of subsurfaces) from appearing on the
> taskbar, but Wayland can't express those types yet. Perhaps this is an idea
> for a future addition.

So what are these surfaces that have set_parent called, if some of them
should be "in the taskbar" and some not?

Is the only effect and intent of set_parent really just the stacking
order and unmapping?

Can I use set_parent to arbitrarily re-stack my xdg_surfaces wrt. any
kinds of wl_surfaces? That seems somehow wrong, is there no real
semantic purpose behind this?

Btw. what if a client creates a loop?
A.set_parent(B)
B.set_parent(C)
C.set_parent(A)
You probably want to forbid that.

> > >       </description>
> > >       <arg name="parent" type="object" interface="wl_surface"
> > allow-null="true"/>
> >
> > What kind of wl_surfaces are allowed here, what raises a protocol error?
> >
> > >     </request>
> > >
> > >     <request name="set_app_id">
> > >       <description summary="set surface class">
> >
> > Id or class? Pick one. I would kind of prefer not id, because it is
> > quite an overloaded term, but if it is the term that DEs nowadays use,
> > then so be it.
> >
> 
> Old documentation, came from wl_shell_surface, which was reusing
> terminology from X11: WM_CLASS. Fixed.

Thought so. :-)

> > >       Set an id for the surface.
> > >
> > >       The app id identifies the general class of applications to which
> > >       the surface belongs.
> > >
> > >       It should be the ID that appears in the new desktop entry
> > >       specification, the interface name.
> >
> > Could we have a reference to the specification here?
> >
> > From some very old memory, I thought this would be the .desktop file
> > path, but apparently this is the... the... dbus interface name like
> > thing, right? Like "org.gnome.shotwell" or whatever.
> >
> > And the DE will use this app_id to find the .desktop file to get an
> > icon for this window, etc.?
> >
> 
> Yeah. I've updated the documentation to make this clearer.

Nice.

> > >       </description>
> > >       <arg name="app_id" type="string"/>
> >
> > Are there any cases where a string could raise a protocol error?
> >
> 
> No.
> 
> 
> > >     <request name="show_window_menu">
> > >       <description summary="show the window menu">
> > >         Clients implementing client-side decorations might want to show
> > >         a context menu when right-clicking on the decorations, giving the
> > >         user a menu that they can use to maximize or minimize the window.
> > >
> > >         This request asks the compositor to pop up such a window menu at
> > >         the given position, relative to the parent surface. There are
> > >         no guarantees as to what the window menu contains.
> > >
> > >         Your surface must have focus on the seat passed in to pop up the
> > >         window menu.
> >
> > Is any focus, e.g. pointer, really enough? I.e. just moving the cursor
> > over a window would allow it to pop this up, without needing a
> > button-down? I see for instance wl_pointer.enter carries a serial, will
> > that do?
> >
> 
> I've reworded this (and all the other places we talk about grabbing and use
> wl_seat/serial pairs) to mention that it must be in response to a user
> action. The implementation in mutter checks that it's in response to a
> pointer or touch event. The fact that it doesn't allow keyboard is probably
> a bug.

Right.

> > How will the client know when the window menu is gone?
> >
> 
> Why does it need to know? This is just like some other compositor popup
> came up and stole focus from it. Or screen lock, or...

Yeah, maybe it doesn't. There is nothing the client could spontaneously
do that would affect or be affected by the window menu being open, is
there? Or, could something trigger e.g. minimize while the window menu
is open? Something the client should not do while the menu is open?

Hmm, who controls the cursor?

> > Oh btw. we still don't have the popup placement negotiation protocol
> > (to avoid popup going off-screen), but the draft I read a long time ago
> > assumed, that the client knows in advance the size of the popup
> > surface. We don't know the size here. I'm not sure if solving that
> > would change something here.
> >
> 
> The compositor can pivot the menu around a point, which is very likely
> going to be the current cursor position. Specifying a box would be overall
> more correct if the popup instead stemmed from a button (so it could appear
> on all sides of the box) but I don't imagine that clients will ever use
> this on a button. We could add it for completeness if you're really
> concerned.

You cannot imagine a button in the title bar for "window menu"? :-)

How about positioning the menu so that it won't cover the title bar?
Would that be a reasonable intent?

Btw. that's an interesting idea, specifying a box around which to
pivot.

> > >     </request>
> > >
> > >     <request name="move">
> > >       <description summary="start an interactive move">
> > >       Start a pointer-driven move of the surface.
> >
> > or touch, or maybe even keyboard?
> > Could we implement keyboard-based window movement with this?
> >
> 
> Currently, both mutter and weston check to see that either the pointer or
> touch are pressed. This is designed for the "I pressed the titlebar" case.
> It could be used for a keyboard shortcut for initiating window move, but we
> implement that as a compositor keybinding. Use cases for client-initiated
> keyboard move are welcome.

I can't really judge that. I'm still mostly stuck in the 90's GUI.

But I suppose at least touch should be mentioned in the spec.

> > >       This request must be used in response to a button press event.
> > >       The server may ignore move requests depending on the state of
> > >       the surface (e.g. fullscreen or maximized).
> >
> > Again, how does the client know when the move has ended?
> >
> 
> Should it care? The compositor can move the window at any time it wants,
> without the client knowing. I don't see the value in telling the client
> when it has moved, but only sometimes.

So that the client can revert the cursor from 4-directional arrow to the
normal cursor? Or who handles the cursor? That is related to the
wl_pointer.leave/enter events, too.

If the wl_pointer does not leave when move starts, it won't enter when
the move stops, and nothing tells the client to reset the cursor?
Is there a wl_pointer.leave involved?

> > >     </request>
> > >
> > >     <enum name="resize_edge">
> > >       <description summary="edge values for resizing">
> > >       These values are used to indicate which edge of a surface
> > >       is being dragged in a resize operation. The server may
> > >       use this information to adapt its behavior, e.g. choose
> > >       an appropriate cursor image.
> >
> > And automatically reposition the surface accordingly? (Resize from
> > top/left.)
> >
> 
> Surfaces don't care about positioning, so it doesn't need to know.

The compositor certainly cares. This is not only a specification for
clients, but also implementation guidance for compositors.

The main question here is: is the client itself responsible for moving
the surface by using the wl_surface.attach(x,y) parameters like it used
to, or is this something the compositor is supposed to do automatically?

Certainly the surface needs to move, if it is resized from top-left,
but who moves it?

> > >       </description>
> > >       <entry name="none" value="0"/>
> > >       <entry name="top" value="1"/>
> > >       <entry name="bottom" value="2"/>
> > >       <entry name="left" value="4"/>
> > >       <entry name="top_left" value="5"/>
> > >       <entry name="bottom_left" value="6"/>
> > >       <entry name="right" value="8"/>
> > >       <entry name="top_right" value="9"/>
> > >       <entry name="bottom_right" value="10"/>
> >
> > No use for centered resize, e.g. left_right?
> >
> 
> You might recognize that the enum is actually a bitfield. Clever users
> might notice this and try to resize with a value like LEFT | RIGHT. Clever
> users get protocol errors.
> 
> If you know of a WM that uses centered resize, let me know and I'll check
> it out. I can't imagine a good UI for this.

Ok. Did I mention I'm stuck in the 90's with my GUI concepts?
I've no idea what e.g. a modern GNOME or KDE desktop supports.
Yes, that makes me a very poor person to review this, but I haven't
seen anyone else comment on this lately. :-/

> > What about touch and keyboard, not just pointer? Or any other input
> > style. Resize with keyboard might be a nice proof of concept, but with
> > this method rather than what weston-resizor does, no?
> >
> 
> That's implemented by a WM keybinding. Again, this is more about
> initiating resize dragging from decoration edges.
> 
> 
> > So with this, the compositor is handling the repositioning of the
> > surface automatically, is it not? No need for the client to fiddle with
> > wl.surface.attach arguments x,y?
> >
> 
> Yeah, we should probably mention that the surface is anchored. In fact,
> attach x/y are completely ignored when in the "resizing" state. I keep
> discovering more reasons to hate attach x/y.

Yeah, those definitely need to be written down.

> > Could also mention, that the client will be receiving configure events
> > while interactive resizing happens, and when the operation ends.
> >
> 
> Sure.
> 
> > >       </description>
> > >       <entry name="maximized" value="1" summary="the surface is
> > maximized">
> > >         The surface is maximized. The window geometry specified in the
> > configure
> > >         event must be obeyed by the client.
> >
> > I suppose "the configure event" really means "the latest configure
> > event the client has received"? A client may receive many configure
> > events before it gets to repainting, right?
> >
> > >       </entry>
> > >       <entry name="fullscreen" value="2" summary="the surface is
> > fullscreen">
> > >         The surface is fullscreen. The window geometry specified in the
> > configure
> > >         event must be obeyed by the client.
> >
> > Really? So, will we rely on wl_viewport for scaling low-res apps to
> > fullscreen? No provision for automatic black borders in aspect ratio or
> > size mismatch, even if the display hardware would be able to generate
> > those for free while scanning out the client buffer, bypassing
> > compositing?
> >
> 
> > Since we have a big space for these states, I suppose we could do those
> > mismatch cases in separate and explicit state variants of fullscreen,
> > could we not?
> >
> 
> I explicitly removed this feature from the first draft of the patch simply
> to make my life easier as a compositor writer. We could add additional
> states for this, or break fullscreen into multiple states: "fullscreen +
> size_strict" or "fullscreen + size_loose" or something.

As long as you have a plan how to add these features later, that is ok.
If we need any... more below.

> I am not familiar enough with the sixteen different configurations of the
> old fullscreen system to make an informed decision about what most clients
> want. Your help and experience is very appreciated. I'm not keen to add
> back the matrix of configuration options.

Ok, let's take something that is quite likely a major use case:
fullscreen video playback.

It requires aspect-correct scaling. We can scale with wl_viewport, but
we cannot use wl_viewport to account for mismatching aspect ratio
between the output and the video. In some cases, we need some black
bars.

It would be extremely beneficial if the client could just submit decoded
video frames as is. If the compositor is able to scale (wl_viewport,
check) and add black bars as needed, we have a good chance to create
a zero-copy video pipeline from a hw-decoder, through the client and
the compositor, to the hw-overlay.

If you require exact size for fullscreen windows, apps need to use
a sub-surface to achieve the black bars.

Ok, so this totally works, even if you require exact size. The scaling
method is then completely dictated by the client. Looks like you are
right and we do not need another way to do this, provided we can rely
on wl_subsurface and wl_viewport.

Then to the question, what do you do, when the size still does not
match?

Let's say you have some app as fullscreen, and all is fine. Then the
compositor switches the video mode permanently. The app is constantly
submitting new frames before it sees that it needs to change the size.
The compositor is getting at least one frame at the "incorrect" size
after the mode switch and sending new configures, before the app can
react. How does this work without killing the app?

Answering my own question, I suppose the compositor will temporarily
scale the window to fill the screen, and we don't need anything
written in the protocol spec about it. Right?

I think this is one of the cases where we want "undefined behaviour"
and not a protocol error. One compositor might scale, another might
wait a bit for a correctly sized buffer before actually changing the
mode, etc.

> > >       </entry>
> > >       <entry name="resizing" value="3">
> > >         The surface is being resized. The window geometry specified in
> > the
> > >         configure event is a maximum; the client cannot resize beyond it.
> > >         Clients that have aspect ratio or cell sizing configuration can
> > use
> > >         a smaller size, however.
> >
> > Oh, there is an explicit state for this, ok.
> >
> 
> This is about the interactive resizing. The comment makes me think you
> might be confused and think it's related to fullscreen.

Definitely interactive resizing, because otherwise I would have
asked how the client knows when resizing ends, just like for the move
and the window menu.

It is useful to know when interactive resizing is going on, so that the
client can optimize the buffer allocations.

> > >       The width and height arguments specify a hint to the window
> > >         about how its surface should be resized in window geometry
> > >         coordinates. The states listed in the event specify how the
> > >         width/height arguments should be interpreted.
> >
> > The window geometry coordinates are not clearly specified. Are they
> > defined by the set_window_geometry such that 0,0 here means x,y there?
> >
> 
> What this means is that if you get a configure event with a width/height of
> let's say 640x480, then the window geometry of the window should be
> 640x480. If you have 10px shadows around the edges of your window, then the
> buffer will be 660x500, and you will call set_window_geometry(10, 10, 640,
> 480).

The surface will be 660x500, not necessarily the buffer, but yes, ok.

> It doesn't mean anything about the current geometry set. It means something
> about the new window geometry you should be setting.
> 
> See the documentation of set_window_geometry for the definition of "window
> geometry coordinate space" and the rationale behind it.

As long as it is specified somewhere, and here is a reference to that.

> > >         A client should arrange a new surface, and then send a
> > >         ack_configure request with the serial sent in this configure
> > >         event before attaching a new surface.
> >
> > "Before attaching a new surface" is vague. Does this mean "immediately
> > before issuing wl_surface.commit on the associated wl_surface"?
> > One might also argue about "immediately", I think your point is that
> > the very next commit after a ack_configure applies all the changes
> > suggested by that configure. Right?
> >
> 
> It allows compositors to know where in the event stream this client is. For
> instance, the compositor might want to maximize a constantly animating
> surface. To do that, it will need to move the surface's view to the
> top-left of the screen, but in order to know when it can move the view, it
> needs to know when the proper buffer is attached so we don't see a two-step
> change of the surface moving to the top-left, then becoming maximized. The
> surface is constantly animating, so there will be extra attaches between
> the round-trips of "set_maximized", "configure" and then "attach/commit".
> 
> The compositor could use heuristics like "when I see window geometry
> equivalent to what I sent out, then assume it's the surface we're waiting
> on", but better design this into the protocol.
> 
> (Note that this isn't specific to constantly animating surfaces. When
> releasing the maximize button in weston-terminal, it first attaches a new
> surface of the maximize button going from the pressed state back to hover
> state at the small size, before it receives the configure. The compositor
> sees two attaches: small window with hover, and big maximized window.)

I do understand why this request is there, but I just want the
specification to be written clearly. The wording left quite some room
for interpretation. "The very next commit after ack_configure" is clear
and explicit, if it is correct. "Attach" should be irrelevant here, for
instance, the client might realize the change completely through
wl_viewport without a wl_surface.attach.

> > What should a client do, if it receives states it does not understand?
> >
> 
> Ignore it.
> 
> 
> > What if the compositor sends a state that demands something, and the
> > client does not obey the demand, because a) the client is broken, or b)
> > the client is not broken, but does not know the state?
> >
> 
> There was a plan for a complex handshaking protocol, but I'm tempted to say
> that clients must understand all core states (up to their surface version),
> and clients need to ignore states in extensions they don't understand.

Ok, that works.

> Handshaking for extension states is done by some other unspecified
> mechanism. In mutter/GTK+, we use the existence of a bound global from our
> private protocol (gtk_shell) to know the client knows about the state.

Maybe something like that should be written in the spec, too.

> > >       <arg name="serial" type="uint"/>
> >
> > This serial is in a completely new "serial name space", isn't it?
> > Not to be confused with e.g. input serials.
> >
> 
> It's a serial. wl_display_next_serial serial. I thought a serial was a
> serial was a serial.

This serial cannot be used for, e.g., wl_pointer.set_cursor, or
xdg_shell.get_xdg_popup, right?

And getting a configure event does not invalidate the latest serial
that was valid for wl_pointer.set_cursor etc.?

The compositor has different serial variables for each thing. There may
be one global serial counter that generates all serials, but there are
many different "slots" that store the latest valid serial for each
request that needs a serial. That is what creates the "namespaces" for
serials: serials provided by events A, B, and C can be used for
requests D and E, and serials provided by events F and G can be
used for requests H, I, and J. But you cannot reliably mix them,
like B -> H.

A practical example:

Compositor sends button-down with serial 5.
Compositor sends configure with serial 6.
Client receives button-down with serial 5.
Client issues wl_pointer.set_cursor using serial 5.
=> the set_cursor must not be rejected because of the serial bump by
configure

Another example:

Client receives button-up with serial 7.
Client receives configure with serial 8.
Compositor sends wl_pointer.leave and bumps input serial to 9.
Client issues get_xdg_popup with serial 7.
Client receives the wl_pointer.leave.
Client issues ack_configure with serial 8.
Compositor receives the get_xdg_popup with serial 7 and immediately
dismisses the popup, as the input serial is already greater (9+), and
the user has already continued to interact with something else.
Compositor receives ack_configure with serial 8, and knows all its
configures are ack'd for now (compare against 8, not 9+).

Well, that is how I understand it, hope I'm not too far off.

Btw. I see code like

desktop-shell/shell.c:1735:	    seat->pointer->grab_serial == serial) {
desktop-shell/shell.c:1742:		   seat->touch->grab_serial == serial) {

which uses equality and not less-than-or-equal, and I'm not sure if
that was deliberately written like that to solve some issues, or if it
was an oversight. OTOH, we have

src/input.c:1582:	if (pointer->focus_serial - serial > UINT32_MAX / 2)

src/data-device.c-714-	if (seat->selection_data_source &&
src/data-device.c:715:	    seat->selection_serial - serial < UINT32_MAX / 2)

There seem to be many different serials, some compared with equality
and some with greater-than-or-equal with wrap-around awareness, and I'm
not really sure how it all works.

> >     </event>
> > >
> > >     <request name="ack_configure">
> > >       <description summary="ack a configure event">
> > >         When a configure event is received, a client should then ack it
> > >         using the ack_configure request to ensure that the compositor
> > >         knows the client has seen the event.
> > >
> > >         By this point, the state is confirmed, and the next attach should
> > >         contain the buffer drawn for the configure event you are acking.
> >
> > Again, "attach" is vague, you probably mean wl_surface.commit.
> >
> > Is it allowed to skip acks? If a client receives configure events with
> > serials 3,4,5, can it just ack 5 directly?
> >
> 
> Yes. I thought we had this written down somewhere.
> 
> 
> > What if you ack 4, ack 5, without a commit in between?
> >
> 
> Right now, we just update the request for the new serial.
> 
> 
> > What if you ack 5, ack 3 (going backwards, i.e. client bug), would
> > there be a protocol error?
> 
> 
> No. Again, right now we just update the serial that we'll use on the
> client. Keep in mind that it could be correct if the serial has wrapped
> around.

Ah, yes, but we can also check for the wrap-around, assuming the
difference never legitimately grows near UINT32_MAX/2.

> > Or if you ack a serial the compositor has
> > not sent yet? Or ack the same serial twice?
> 
> 
> Right now, we only use this information for pending moves, so surfaces
> might move too early or too late. I don't really care what happens to
> illegal clients, but we can't really validate this properly. Is "undefined
> behavior", which perhaps includes a protocol error, OK to you?

I think it would be possible to validate it by recording the last sent
serial for each kind per client, but maybe not really worth it. So
undefined and allowing protocol error (which error?) would be ok by me.

> > >       </description>
> > >       <arg name="x" type="int"/>
> > >       <arg name="y" type="int"/>
> > >       <arg name="width" type="int"/>
> > >       <arg name="height" type="int"/>
> 
> 
> 
> > Is there a protocol error, if width or height <= 0?
> >
> 
> Let's say there is, even if there isn't one right now.

Ok.

> > >     </request>
> > >
> > >     <request name="set_maximized" />
> > >     <request name="unset_maximized" />
> >
> > These two requests are undocumented.
> >
> 
> ... and? Do we need documentation here? It would just be <description
> summary="set the window as maximized">Maximize the surface</description>.
> Documentation by reshuffling words isn't any more useful than no
> documentation.

You assume that "maximized" is a well-known term? Ok. So what happens
when a client issues either of these two requests? Does the compositor
reply something? How does the client synchronize its drawing to the
state? Are these even related to the "maximized" enum value in
the configure event? Can the compositor refuse by never issuing a
configure event with the maximized state changed?

At the very least, these need references to the window state system
documentation, wherever that ends up living.

> > >     <request name="set_fullscreen">
> > >       <description summary="set the window as fullscreen on a monitor">
> > >       Make the surface fullscreen.
> > >
> > >         You can specify an output that you would prefer to be fullscreen.
> > >       If this value is NULL, it's up to the compositor to choose which
> > >         display will be used to map this surface.
> > >       </description>
> > >       <arg name="output" type="object" interface="wl_output"
> > allow-null="true"/>
> >
> > The issue I raised at the fullscreen state enum... otherwise ok.
> >
> > >     </request>
> > >     <request name="unset_fullscreen" />
> > >
> > >     <request name="set_minimized" />
> >
> > And a couple more undocumented requests. Maybe answer the FAQ "where is
> > my unset_minimized?" I.e. how do you get out of minimized, since there
> > is no obvious request.
> >
> > I would like to see an overview of how this state system works. Can a
> > client issue set_maximized at any time? What happens then? At what
> > point does the window get drawn in maximized state?
> >
> 
> Calling set_maximized requests for the compositor to make the surface
> maximized. The compositor will then send the surface a configure event
> containing the correct set of states ("maximized", "activated"), and the
> correct width/height (640x480). The client then takes this information,
> draws its new state, attaches it, calls ack_configure, and then commits.
> 
> I'm not sure where we should put this overview. Perhaps in xdg_surface?

Yes, in the interface level doc might be a good place.

> > Can the compositor choose to set arbitrary states on its own will?
> 
> 
> Yes. If the user hits the maximize keybinding, the client will get a
> configure event out of the blue telling the surface to maximize itself.

Ok, worth noting.

> > >       The popup grab continues until the window is destroyed or a mouse
> > >       button is pressed in any other clients window. A click in any of
> > >       the clients surfaces is reported as normal, however, clicks in
> > >       other clients surfaces will be discarded and trigger the callback.
> >
> > So the client will be getting input 'enter' and 'leave' events as
> > usual, but only this client?
> >
> 
> Yep. An "owner-events grab" (X11 parlance) means that the client receives
> events for all its surfaces, not just the popup. This is so that when
> navigating through traditional submenus, the user can navigate back over to
> the submenu to the left, and that will receive motion events, allowing the
> client to start the timer and dismiss the top menu popup.
> 
> I rewrote the documentation to make this and the other concerns clearer.

Ah, yes, that is a good reason to allow input on more surfaces, if
nested popups are to be realized by multiple simultaneous xdg_popups.

Would it help anything to limit input to the chain of parents only,
instead of all, possibly unrelated, surfaces of the client?

> > What if the user alt+tabs during a popup?
> >
> > When a popup is open, are any xdg_surface operations on the client's
> > other surfaces invalid? Can a client create a second popup? What
> > happens then?
> >
> 
> Submenus work by nesting xdg_popups. So, a client creates an xdg_popup B
> with parent xdg_surface A, then xdg_popup C with parent xdg_popup B, etc.
> 
> I don't know what it means to have two popups for an xdg_surface. We can
> make up some strict semantics for now. Let's say it's a protocol error to
> make more than one popup, if it's not parented to the existing topmost
> popup.

I can go with that, if you see it should work fine.

> > >       The x and y arguments specify the locations of the upper left
> > >       corner of the surface relative to the upper left corner of the
> > >       parent surface, in surface local coordinates.
> >
> > This should go with the get_xdg_popup, no?
> >
> 
> I just put "see xdg_popup" in the docs for "get_xdg_popup".

For general behaviour it is fine to refer to another place in the docs,
but I would expect arguments to be explained in the request they are
part of. I think this particular paragraph would be just fine there.

> > >     </request>
> > >
> > >     <event name="popup_done">
> > >       <description summary="popup interaction is done">
> > >       The popup_done event is sent out when a popup grab is broken,
> > >       that is, when the users clicks a surface that doesn't belong
> > >       to the client owning the popup surface.
> >
> > "The client should destroy the xdg_popup object." I presume?
> > Will re-use of the wl_surface be allowed?
> >
> 
> You can call get_xdg_popup on the same wl_surface again, but you can't
> transform it into any other role.

Yup, we had a long discussion about that.

> > >       </description>
> > >       <arg name="serial" type="uint" summary="serial of the implicit
> > grab on the pointer"/>
> >
> > What is this serial used for?
> >
> > Oh, is it because the client might not get a input 'enter' event if the
> > input focus is already on some other surface of the client?
> >
> 
> No. It's the serial of the *implicit grab on the pointer* -- that is, the
> serial we passed to get_xdg_popup. I have no idea why it's here, actually.
> Nothing uses it. Removed it in my local branch.

Heh. :-)

> > >     </event>
> > >
> > >   </interface>
> > > </protocol>
> >
> >
> > Okay, phew. Nothing drastically broken that I can see, if not maybe that
> > missing destructor. :-)
> >
> 
> Thank you for your really good review. You really took the time and thought
> it through. I'll have a few more patches going out soon that rewrites the
> docs for xdg-shell and corrects all of the issues mentioned.

Thank you for working on it. After the patches land I might do
another overall review like this one, to see if I spot anything more.

> > Lots and lots of details to specify, though.
> >
> 
> That's the dangerous thing: making a spec so dense and thorough it becomes
> unreadable.

Yeah, I've heard that before.

One more note: don't write this to explicitly depend on the role spec
additions we were writing for wl_surface, because I think we cannot
land the wl_surface additions for 1.6, since the alpha is already out.
But you can still write xdg-shell to be compatible, and then it will
all fall in place in 1.7.


Thanks,
pq
