Wayland debugging with Qtwayland, gstreamer waylandsink, wayland-lib and Weston

Tue Mar 5 12:26:58 UTC 2024

On Mon, 4 Mar 2024 17:59:25 +0000
Terry Barnaby <terry1 at beam.ltd.uk> wrote:

> On 04/03/2024 15:50, Pekka Paalanen wrote:
> > On Mon, 4 Mar 2024 14:51:52 +0000
> > Terry Barnaby <terry1 at beam.ltd.uk> wrote:
> >  
> >> On 04/03/2024 14:14, Pekka Paalanen wrote:  
> >>> On Mon, 4 Mar 2024 13:24:56 +0000
> >>> Terry Barnaby <terry1 at beam.ltd.uk> wrote:
> >>>     
> >>>> On 04/03/2024 09:41, Pekka Paalanen wrote:  
> >>>>> On Mon, 4 Mar 2024 08:12:10 +0000
> >>>>> Terry Barnaby <terry1 at beam.ltd.uk> wrote:
> >>>>>        
> >>>>>> While I am trying to investigate my issue in the QtWayland arena via the
> >>>>>> Qt Jira Bug system, I thought I would try taking Qt out of the equation
> >>>>>> to simplify the application a bit more to try and gain some
> >>>>>> understanding of what is going on and how this should all work.
> >>>>>>
> >>>>>> So I have created a pure GStreamer/Wayland/Weston application to test
> >>>>>> out how this should work. This is at:
> >>>>>> https://portal.beam.ltd.uk/public//test022-wayland-video-example.tar.gz
> >>>>>>
> >>>>>> This tries to implement a C++ Widget style application using native
> >>>>>> Wayland. It is rough and could easily be doing things wrong wrt Wayland.
> >>>>>> However it does work to a reasonable degree.
> >>>>>>
> >>>>>> However, I appear to see the same sort of issue I see with my Qt based
> >>>>>> system in that when a subsurface of a subsurface is used, the Gstreamer
> >>>>>> video is not seen.
> >>>>>>
> >>>>>> This example normally (UseWidgetTop=0) has a top level xdg_toplevel
> >>>>>> desktop surface (Gui), a subsurface to that (Video) and then waylandsink
> >>>>>> creates a subsurface to that which it sets to de-sync mode.
> >>>>>>
> >>>>>> When this example is run with UseWidgetTop=0 the video frames from
> >>>>>> gstreamer are only shown when the top subsurface is manually
> >>>>>> committed with gvideo->update() every second, otherwise the video
> >>>>>> pipeline is stalled.  
> >>>>> This is intentional. From wl_subsurface specification:
> >>>>>
> >>>>>          Even if a sub-surface is in desynchronized mode, it will behave as
> >>>>>          in synchronized mode, if its parent surface behaves as in
> >>>>>          synchronized mode. This rule is applied recursively throughout the
> >>>>>          tree of surfaces. This means, that one can set a sub-surface into
> >>>>>          synchronized mode, and then assume that all its child and grand-child
> >>>>>          sub-surfaces are synchronized, too, without explicitly setting them.
> >>>>>
> >>>>> This is derived from the design decision that a wl_surface and its
> >>>>> immediate sub-surfaces form a seamlessly integrated unit that works
> >>>>> like a single wl_surface without sub-surfaces would. wl_subsurface
> >>>>> state is state in the sub-surface's parent, so that the parent controls
> >>>>> everything as if there was just a single wl_surface. If the parent sets
> >>>>> its sub-surface as desynchronized, it explicitly gives the sub-surface
> >>>>> the permission to update on screen regardless of the parent's updates.
> >>>>> When the sub-surface is in synchronized mode, the parent surface wants
> >>>>> to be updated in sync with the sub-surface in an atomic fashion.
> >>>>>
> >>>>> When your surface stack looks like:
> >>>>>
> >>>>> - main surface A, top-level, root surface (implicitly desynchronized)
> >>>>>      - sub-surface B, synchronized
> >>>>>        - sub-surface C, desynchronized
> >>>>>
> >>>>> Updates to surface C are immediately made part of surface B, because
> >>>>> surface C is in desynchronized mode. If B was the root surface, all C
> >>>>> updates would simply go through.
> >>>>>
> >>>>> However, surface B is a part of surface A, and surface B is in
> >>>>> synchronized mode. This means that the client wants surface A updates to
> >>>>> be explicit and atomic. Nothing must change on screen until A is
> >>>>> explicitly committed itself. So any update to surface B requires a
> >>>>> commit on surface A to become visible. Surface C does not get to
> >>>>> override the atomicity requirement of surface A updates.
> >>>>>
> >>>>> This has been designed so that software component A can control surface
> >>>>> A, and delegate a part of surface A to component B which happens to be
> >>>>> using a sub-surface: surface B. If surface B parts are further
> >>>>> delegated to another component C, then component A can still be sure
> >>>>> that nothing updates on surface A until it says so. Component A sets
> >>>>> surface B to synchronized to ensure that.
> >>>>>
> >>>>> That's the rationale behind the Wayland design.
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>> pq  
> >>>> Ah, thanks for the info, that may be why this is not working even in Qt
> >>>> then.
> >>>>
> >>>> This seems a dropoff in Wayland to me. If a software module wants to
> >>>> display Video into an area on the screen at its own rate, setting that
> >>>> surface to de-synced mode is no use in the general case with this
> >>>> policy.  
> >>> It is of use, if you don't have unnecessary sub-surfaces in synchronized
> >>> mode in between, or you set all those extra sub-surfaces to
> >>> desynchronized as well.  
> >> Well they may not be necessary from the Wayland perspective, but from
> >> the higher level software they are useful to modularise/separate/provide
> >> a join for the software modules especially when software modules are
> >> separate like Qt and GStreamer.  
> > Sorry to hear that.
> >  
> >>>> I would have thought that if a subsurface was explicitly set to
> >>>> de-synced mode then that would be honoured. I can't see a usage case for
> >>>> it to be ignored and its commits synchronised up the tree ?  
> >>> Resizing the window is the main use case.
> >>>
> >>> In order to resize surface A, you also need to resize and paint surface
> >>> B, and for surface B you also need to resize and paint surface C. Then
> >>> you need to guarantee that all the updates from surface C, B and A are
> >>> applied atomically on screen.
> >>>
> >>> Either you have component APIs good enough to negotiate the
> >>> stop-resize-paint-resume on your own, or if the sub-components are
> >>> free-running regardless of frame callbacks, component A can just
> >>> temporarily set surface B to synchronized, resize and reposition it,
> >>> and resume.  
> >> I would have thought that the Wayland server could/would synchronise
> >> screen updates when a higher level surface is resized/moved by itself.  
> > If the whole window is moved, yes. Clients won't observe the
> > window moving even if they wanted to.
> >
> > But a compositor cannot resize anything. Resizing always requires the
> > client to respond with the surface drawn in the new size before it can
> > actually happen. Or a whole bunch of surfaces atomically, if you use
> > sub-surfaces.  
> 
> I would have thought it better/more useful to have a Wayland API call 
> like "stopCommiting" so that an application can sort things out for this 
> and other things, providing more application control. But I really have 
> only very limited knowledge of the Wayland system. I just keep hitting 
> its restrictions.
> 

Right, Wayland does not work that way. Wayland sees any client as a
single entity, regardless of its internal composition of libraries and
others.

When Wayland delivers any event, whether it is an explicit resize event
or an input event (or maybe the client just spontaneously decides to),
that causes the client to want to resize a window, it is then up to the
client itself to make sure it resizes everything it needs to, and keeps
everything atomic so that the end user does not see glitches on screen.

Sub-surfaces' synchronous mode was needed to let clients batch the
updates of multiple surfaces into a single atomic commit. It is the
desync mode that was a non-mandatory add-on. The synchronous mode was
needed, because there was no other way to guarantee that multiple
wl_surface.commit requests apply simultaneously. Without it, if you
commit surface A and then surface B, nothing guarantees that the
compositor will not show A updated and B not yet updated on screen for
a moment.
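
To make the rule concrete, here is a toy model of it in C. This is
only a sketch of the behaviour described in the wl_subsurface text
quoted earlier in this thread, not libwayland code and not how any
particular compositor implements it. All names (struct surface,
surface_commit, effectively_sync, ...) are made up for illustration,
and a single int stands in for all double-buffered surface state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a wl_surface; not a libwayland type. */
struct surface {
    struct surface *parent;                 /* NULL for the root surface */
    struct surface *children[MAX_CHILDREN];
    size_t n_children;
    bool sync;                              /* mode set on this sub-surface */
    bool has_cached;
    int pending, cached, current;           /* "current" is what is on screen */
};

static void surface_init(struct surface *s, struct surface *parent, bool sync)
{
    *s = (struct surface){ .parent = parent, .sync = sync };
    if (parent) {
        assert(parent->n_children < MAX_CHILDREN);
        parent->children[parent->n_children++] = s;
    }
}

/* A sub-surface behaves as synchronized if it, or any ancestor
 * sub-surface, is synchronized. The root is never synchronized. */
static bool effectively_sync(const struct surface *s)
{
    if (!s->parent)
        return false;
    return s->sync || effectively_sync(s->parent);
}

/* After s has been applied, apply the cached state of every child
 * that behaves as synchronized, recursively down the tree. */
static void apply_cached(struct surface *s)
{
    for (size_t i = 0; i < s->n_children; i++) {
        struct surface *c = s->children[i];
        if (!effectively_sync(c))
            continue;
        if (c->has_cached) {
            c->current = c->cached;
            c->has_cached = false;
        }
        apply_cached(c);
    }
}

/* Model of wl_surface.commit: cache the state if the surface behaves
 * as synchronized, otherwise apply it together with the children's
 * cached state. */
static void surface_commit(struct surface *s)
{
    if (effectively_sync(s)) {
        s->cached = s->pending;
        s->has_cached = true;
        return;
    }
    s->current = s->pending;
    apply_cached(s);
}
```

With root A, synchronized B under A, and desynchronized C under B,
committing C or B in this model changes nothing on screen until A is
committed, at which point all three update together. Setting B to
desynchronized lets a commit on C go through on its own.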

Wayland intentionally did not include any mechanism in its design
intended for communication between a single client's internal
components. Why use a display server as an IPC middle-man for something
that should be process-internal communication? After all, Wayland is
primarily a protocol - inter-process communication.

> >  
> >> As the software components are separately developed systems it is
> >> difficult to sync between them without changing them, but may be possible.  
> > Yes, Wayland does many things differently than older toolkits
> > expected.
> >
> >
> > ...
> >  
> >>> Is Gst waylandsink API the kind that it internally creates a new
> >>> wl_surface for itself and makes it a sub-surface of the given surface,
> >>> or is there an option to tell Gst to just push frames into a given
> >>> wl_surface?
> >>>
> >>> If the former, then waylandsink is supposed to somehow give you an API
> >>> to set the sub-surface position and z-order wrt. its parent and
> >>> siblings. If the latter, you would create wl_subsurface yourself and
> >>> keep control of it to set the sub-surface position and z-order.
> >>>
> >>> Either way, the optimal result is one top-level wl_surface, with one
> >>> sub wl_surface drawn by Gst, and no surfaces in between in the
> >>> hierarchy.  
> >> Yes, the Gst waylandsink API creates a new subsurface for itself from
> >> the GUI's managed surface to separate itself from the GUI (Qt/Gnomes)
> >> surfaces. It doesn't allow you to provide a surface to directly use. I
> >> don't think it allows the surface to be moved/resized although it can
> >> display video at an offset and size as far as I know (although it may
> >> actually change the surface to do this I will have a look). It doesn't
> >> allow the z-order to be changed I think. It expects the GUI to change
> >> its surface and I guess assumes its subsurface would effectively move in
> >> z and xy position due to the GUI moving/raising/lowering its surface
> >> (the parent) in a similar manner to how X11 would have done this.  
> > Sounds like gst waylandsink is lacking z-ordering API.
> >
> > Wayland sub-surfaces are very different from X11 windows. One
> > fundamental difference is that sub-surfaces can extend beyond their
> > parent's area. Another is that sub-surfaces always have their own
> > storage (because you have to explicitly attach wl_buffers to them),
> > they cannot address the parent's storage like in X11. And more.
> >
> > X11 windows were perhaps meant for individual widgets like buttons to
> > optimise drawing and input handling. Wayland sub-surfaces are meant for
> > things that need a separate wl_buffer in order to be off-loaded to DRM
> > KMS hardware for direct scanout. It's like the opposite ends of the
> > granularity spectrum of off-loading things to the display server.  
> 
> Yes, as far as I know X11 Windows were for individual widgets as well as 
> overall application windows. When I started programming in X11, in the 
> later 80's/early 90's there was the X11 Intrinsics toolkit that did just 
> that. It nicely separated the Widgets drawing from one another 
> modularising this all down to the protocol and display server level.

Right, and that is the polar opposite of Wayland. Wayland was invented
at a time when application toolkits were basically drawing complete
pixel images of whole windows client-side and sending that image to the
display server. Hence, the deliberate design decision to not push any
client side architectural details into the display server, as there was
simply no need.

That modularising is supposed to happen inside a client toolkit.

Routing internal things through the display server and back is just
extra overhead and latency. One can communicate with in-process
libraries much more efficiently than that.

> But 
> it was inefficient especially when more 3D looking screen objects were 
> wanted (moving to Motif) and so GUI toolkits started using DRM to draw 
> to the one Window. Mind you current GUI's have gone back to the plain 
> and simple early days look again!
> 
> The concept of having a generic Window/Surface that can be in a tree 
> hierarchy is still useful though where you want to modularise software 
> and/or have separate distinct pieces of software displaying into an 
> application's GUI. It's a shame Wayland's current surface system doesn't 
> work well as a tree hierarchy for such things.

Right. Wayland was never meant to do that. I've seen people saying bad
things about XEmbed for instance.

Wayland sub-surfaces were an answer to a very specific problem: how to
leverage display hardware planes. Hardware planes are a much more
efficient way of compositing (parts of) windows than CPU or GPU
composition, but they are also much more scarce and rigid.
Hardware planes are especially useful for videos.

> >> I will try the middle desync and/or this method by managing the
> >> waylandsink surface outside of waylandsink if I can and if it doesn't
> >> mess up either Qt's or waylandsink's operations.
> >>
> >> Thanks for the input.  
> > Thanks,
> > pq  
> 
> I believe I have managed to work around this issue without having to 
> change Qt or Waylandsink API's and code, although I have only tested 
> under Fedora and not the actual embedded platform it needs to run on.
> 
> I couldn't set the QWidget's subsurface to desynced as I cannot get its 
> subsurface as far as I can see. Qt provides a method to get a QWidget's 
> wl_surface, but not its wl_subsurface as far as I can see with a brief 
> look. It's all hidden away (unless I change Qt code) and I couldn't see a 
> way of doing this from Wayland. Maybe my discussions in the Qt Jira 
> might lead to a method in the future.
> 
> I could probably modify Waylandsink to provide an API to manage its 
> subsurface, but ideally I don't want to modify upstream code unless 
> really needed. Maybe in the long term this is the way to go although 
> fixing at the Qt level sounds better if possible (it would probably need 
> Qt and waylandsink mods).

I do wonder if gst waylandsink could use API improvements upstream.

> What I have done for now, barring a quick cleaner/better method, is to do 
> work at the Wayland level in my test application. Here for my Video 
> subwidget, I create a new surface/subsurface from the Qt toplevel 
> surface, set it to desynced mode and pass that to waylandsink. As I now 
> have access to the wl_subsurface waylandsink is using as its parent, I 
> can raise and lower it, position it and resize it giving me some degree 
> of control. I have had to go through all the Wayland wl_registry work to 
> get compositor, subcompositor API's to do this (As Qt does not provide 
> access to all of these). I am unsure if this method will provide the 
> ability for my video to sit behind transparent background QWidgets, but 
> I can work without that ability for now in the system I am developing. I 
> just hope I don't see other issues with this approach.

I'm glad you found something.

> Wayland/Qt/Gstreamer has a knack of getting in your way! I might have to 
> write/modify an alternative Weston compositor to get around this Wayland 
> feature/flaw, and support simple top level surface moves etc. (which are 
> also causing problems, I already have had to add a different shell to 
> allow the application to move its windows to a separate HDMI screen) 
> although obviously that is a hack as well, but barring a proper way to do 
> this at least it should work.

Yes, Wayland window management development is very much driven by
generic desktops, which have some conflicting goals with many embedded
designs. On desktops, apps simply do not have all the information to
position their windows while the compositor knows better. Outside of
desktops, the system designer can give the chosen apps the desired
behaviour before the system is deployed.

Weston has kiosk-shell, which allows you to configure which app window
should go on which output, IIRC. The basic assumption in kiosk-shell is
that any active app is fullscreen.
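
As a rough sketch of what that could look like in weston.ini, going by
the app-ids key described in weston.ini(5): the output names and
application IDs below are invented placeholders, and the exact shell
name may differ between Weston releases, so check the man page for
your version.

```ini
[core]
shell=kiosk-shell.so

# Hypothetical output name; use the connector names Weston reports.
[output]
name=HDMI-A-1
# Hypothetical app ID; must match what the client sets with
# xdg_toplevel.set_app_id.
app-ids=com.example.video

[output]
name=HDMI-A-2
app-ids=com.example.gui
```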

For other Weston uses, there is a rough idea:
https://gitlab.freedesktop.org/wayland/weston/-/issues/520

Someone might even look into that this year.

Thanks,
pq