Wayland debugging with Qtwayland, gstreamer waylandsink, wayland-lib and Weston

Fri Mar 8 14:49:58 UTC 2024

On 05/03/2024 12:26, Pekka Paalanen wrote:
> On Mon, 4 Mar 2024 17:59:25 +0000
> Terry Barnaby <terry1 at beam.ltd.uk> wrote:
>
>> On 04/03/2024 15:50, Pekka Paalanen wrote:
>>> On Mon, 4 Mar 2024 14:51:52 +0000
>>> Terry Barnaby <terry1 at beam.ltd.uk> wrote:
>>>   
>>>> On 04/03/2024 14:14, Pekka Paalanen wrote:
>>>>> On Mon, 4 Mar 2024 13:24:56 +0000
>>>>> Terry Barnaby <terry1 at beam.ltd.uk> wrote:
>>>>>      
>>>>>> On 04/03/2024 09:41, Pekka Paalanen wrote:
>>>>>>> On Mon, 4 Mar 2024 08:12:10 +0000
>>>>>>> Terry Barnaby <terry1 at beam.ltd.uk> wrote:
>>>>>>>         
>>>>>>>> While I am trying to investigate my issue in the QtWayland arena via the
>>>>>>>> Qt Jira Bug system, I thought I would try taking Qt out of the equation
>>>>>>>> to simplify the application a bit more to try and gain some
>>>>>>>> understanding of what is going on and how this should all work.
>>>>>>>>
>>>>>>>> So I have created a pure GStreamer/Wayland/Weston application to test
>>>>>>>> out how this should work. This is at:
>>>>>>>> https://portal.beam.ltd.uk/public//test022-wayland-video-example.tar.gz
>>>>>>>>
>>>>>>>> This tries to implement a C++ Widget style application using native
>>>>>>>> Wayland. It is rough and could easily be doing things wrong wrt Wayland.
>>>>>>>> However it does work to a reasonable degree.
>>>>>>>>
>>>>>>>> However, I appear to see the same sort of issue I see with my Qt based
>>>>>>>> system in that when a subsurface of a subsurface is used, the Gstreamer
>>>>>>>> video is not seen.
>>>>>>>>
>>>>>>>> This example normally (UseWidgetTop=0) has a top level xdg_toplevel
>>>>>>>> desktop surface (Gui), a subsurface to that (Video) and then waylandsink
>>>>>>>> creates a subsurface to that which it sets to de-sync mode.
>>>>>>>>
>>>>>>>> When this example is run with UseWidgetTop=0 the video frames from
>>>>>>>> gstreamer are only shown when the top subsurface is manually
>>>>>>>> committed with gvideo->update() every second, otherwise the video
>>>>>>>> pipeline is stalled.
>>>>>>> This is intentional. From wl_subsurface specification:
>>>>>>>
>>>>>>>           Even if a sub-surface is in desynchronized mode, it will behave as
>>>>>>>           in synchronized mode, if its parent surface behaves as in
>>>>>>>           synchronized mode. This rule is applied recursively throughout the
>>>>>>>           tree of surfaces. This means, that one can set a sub-surface into
>>>>>>>           synchronized mode, and then assume that all its child and grand-child
>>>>>>>           sub-surfaces are synchronized, too, without explicitly setting them.
>>>>>>>
>>>>>>> This is derived from the design decision that a wl_surface and its
>>>>>>> immediate sub-surfaces form a seamlessly integrated unit that works
>>>>>>> like a single wl_surface without sub-surfaces would. wl_subsurface
>>>>>>> state is state in the sub-surface's parent, so that the parent controls
>>>>>>> everything as if there was just a single wl_surface. If the parent sets
>>>>>>> its sub-surface as desynchronized, it explicitly gives the sub-surface
>>>>>>> the permission to update on screen regardless of the parent's updates.
>>>>>>> When the sub-surface is in synchronized mode, the parent surface wants
>>>>>>> to be updated in sync with the sub-surface in an atomic fashion.
>>>>>>>
>>>>>>> When your surface stack looks like:
>>>>>>>
>>>>>>> - main surface A, top-level, root surface (implicitly desynchronized)
>>>>>>>       - sub-surface B, synchronized
>>>>>>>         - sub-surface C, desynchronized
>>>>>>>
>>>>>>> Updates to surface C are immediately made part of surface B, because
>>>>>>> surface C is in desynchronized mode. If B was the root surface, all C
>>>>>>> updates would simply go through.
>>>>>>>
>>>>>>> However, surface B is a part of surface A, and surface B is in
>>>>>>> synchronized mode. This means that the client wants surface A updates to
>>>>>>> be explicit and atomic. Nothing must change on screen until A is
>>>>>>> explicitly committed itself. So any update to surface B requires a
>>>>>>> commit on surface A to become visible. Surface C does not get to
>>>>>>> override the atomicity requirement of surface A updates.
>>>>>>>
>>>>>>> This has been designed so that software component A can control surface
>>>>>>> A, and delegate a part of surface A to component B which happens to be
>>>>>>> using a sub-surface: surface B. If surface B parts are further
>>>>>>> delegated to another component C, then component A can still be sure
>>>>>>> that nothing updates on surface A until it says so. Component A sets
>>>>>>> surface B to synchronized to ensure that.
>>>>>>>
>>>>>>> That's the rationale behind the Wayland design.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> pq
>>>>>> Ah, thanks for the info, that may be why this is not working even in Qt
>>>>>> then.
>>>>>>
>>>>>> This seems a dropoff in Wayland to me. If a software module wants to
>>>>>> display Video into an area on the screen at its own rate, setting that
>>>>>> surface to de-synced mode is no use in the general case with this
>>>>>> policy.
>>>>> It is of use, if you don't have unnecessary sub-surfaces in synchronized
>>>>> mode in between, or you set all those extra sub-surfaces to
>>>>> desynchronized as well.
>>>> Well they may not be necessary from the Wayland perspective, but from
>>>> the higher level software they are useful to modularise/separate/provide
>>>> a join for the software modules especially when software modules are
>>>> separate like Qt and GStreamer.
>>> Sorry to hear that.
>>>   
>>>>>> I would have thought that if a subsurface was explicitly set to
>>>>>> de-synced mode then that would be honoured. I can't see a usage case for
>>>>>> it to be ignored and its commits synchronised up the tree ?
>>>>> Resizing the window is the main use case.
>>>>>
>>>>> In order to resize surface A, you also need to resize and paint surface
>>>>> B, and for surface B you also need to resize and paint surface C. Then
>>>>> you need to guarantee that all the updates from surface C, B and A are
>>>>> applied atomically on screen.
>>>>>
>>>>> Either you have component APIs good enough to negotiate the
>>>>> stop-resize-paint-resume on your own, or if the sub-components are
>>>>> free-running regardless of frame callbacks, component A can just
>>>>> temporarily set surface B to synchronized, resize and reposition it,
>>>>> and resume.
>>>> I would have thought that the Wayland server could/would synchronise
>>>> screen updates when a higher level surface is resized/moved by itself.
>>> If the whole window is moved, yes. Clients won't observe the
>>> window moving even if they wanted to.
>>>
>>> But a compositor cannot resize anything. Resizing always requires the
>>> client to respond with the surface drawn in the new size before it can
>>> actually happen. Or a whole bunch of surfaces atomically, if you use
>>> sub-surfaces.
>> I would have thought it better/more useful to have a Wayland API call
>> like "stopCommiting" so that an application can sort things out for this
>> and other things, providing more application control. But I really have
>> only very limited knowledge of the Wayland system. I just keep hitting
>> its restrictions.
>>
> Right, Wayland does not work that way. Wayland sees any client as a
> single entity, regardless of its internal composition of libraries and
> others.
>
> When Wayland delivers any event, whether it is an explicit resize event
> or an input event (or maybe the client just spontaneously decides to),
> that causes the client to want to resize a window, it is then up to the
> client itself to make sure it resizes everything it needs to, and keeps
> everything atomic so that the end user does not see glitches on screen.
>
> Sub-surfaces' synchronous mode was needed to let clients batch the
> updates of multiple surfaces into a single atomic commit. It is the
> desync mode that was a non-mandatory add-on. The synchronous mode was
> needed, because there was no other way to guarantee that multiple
> wl_surface.commit requests apply simultaneously. Without
> it, if you commit surface A and then surface B, nothing will guarantee
> that the compositor would not show A updated and B not on screen for a
> moment.
>
> Wayland intentionally did not include any mechanism in its design
> intended for communication between a single client's internal
> components. Why use a display server as an IPC middle-man for something
> that should be process-internal communication. After all, Wayland is
> primarily a protocol - inter-process communication.

Well, as you say, it is up to the client to perform all of the surface
resize work. So it seems to me that if the client had an issue with
pixel-perfect resizing it could always set any of its desynced surfaces
to sync mode, or just stop the updates to them, while it resizes. I don't
see why Wayland needs to ignore the client's request to set a subsurface
desynced further down the tree. In fact, does it return an error to the
client when the Wayland server ignores this command?
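
For reference, the recursive rule quoted earlier from the wl_subsurface specification can be modelled in a few lines. This is only a minimal sketch, not libwayland or compositor code: `Surface` and `effectivelySynchronized` are hypothetical names invented for illustration. As far as I understand the protocol, set_desync does not raise an error in this situation; the requested mode is recorded, but the sub-surface still behaves as synchronized while any ancestor does.

```cpp
// Toy model of the wl_subsurface synchronization rule (hypothetical names,
// not part of any Wayland library).
struct Surface {
    const Surface* parent;  // nullptr for a root (top-level) surface
    bool synchronized;      // mode requested with set_sync / set_desync
};

// A sub-surface behaves as synchronized if it was set to synchronized itself,
// or if any sub-surface above it in the tree behaves as synchronized.
// A root surface has no parent and is treated as desynchronized.
bool effectivelySynchronized(const Surface& s) {
    if (s.parent == nullptr) {
        return false;
    }
    return s.synchronized || effectivelySynchronized(*s.parent);
}
```

With the tree from the thread (A top-level, B synchronized, C desynchronized) the function reports C as synchronized. If B is desynchronized as well, C is reported as desynchronized.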

Even if the client did not do this and the Wayland surface display was
momentarily slightly wrong, this would be no issue for any of my
programs. Functionality over style every time in my eyes; it's no good
how pretty it is if it doesn't work :) It would not even occur in my
full-screen, non-resizing application anyway.

There really is no internal client communication going through Wayland
here. It's more the opposite. The idea is that there is a set of
surfaces out there that separate tasks within the client are drawing to.
The Video one in particular has dedicated hardware doing most of the
work. The people in GStreamer, say, involved with video processing and
hardware can do all of this without any knowledge of the client's GUI at
all. The Qt GUI also needs to know nothing of the Video processing
system. They both just know about the surface: the Qt GUI simply
moves, resizes, raises and lowers this surface with no drawing input, and
GStreamer just draws at the video sync rate. For an overall system this
separation of duties and knowledge is quite a good method. Qt does not
need to know the internal details of GStreamer processing and GStreamer
does not need to know about Qt. For a Wayland server this is not much
different to handling two separate clients where multiple surfaces are
being composed to the screen. It's no different to an OS kernel managing a
file through separate file descriptors.

I don't understand the Wayland details, but this certainly seems the
best for clients, and a Wayland server should be focused on providing
what is best for the clients' needs; it is a client drawing service after
all. This only needs a Wayland server to allow desynced surfaces at any
level in the tree of subsurfaces as far as I can see, but I may be
missing something.
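
To make the current behaviour concrete, here is a toy model of how commits are cached, based on the explanation quoted earlier in this thread. It is a sketch under stated assumptions and not Weston or libwayland code: `Surface`, `makeSubSurface`, `commit` and `applyCachedChildren` are hypothetical names, and a surface's content is reduced to a string.

```cpp
#include <string>
#include <vector>

// Toy model only; every name here is hypothetical.
struct Surface {
    Surface* parent;                 // nullptr for the top-level surface
    bool synchronized;               // mode requested by the client
    std::vector<Surface*> children;  // sub-surfaces of this surface
    bool hasCached;                  // committed state waiting for the parent
    std::string cached;              // the waiting content
    std::string onScreen;            // content the compositor currently shows
};

// Make `child` a sub-surface of `parent` with the requested mode.
void makeSubSurface(Surface& child, Surface& parent, bool synchronized) {
    child.parent = &parent;
    child.synchronized = synchronized;
    parent.children.push_back(&child);
}

// Synchronized if set so itself, or if any ancestor sub-surface is.
bool effectivelySynchronized(const Surface& s) {
    if (s.parent == nullptr) return false;
    return s.synchronized || effectivelySynchronized(*s.parent);
}

// After a surface's state is applied, apply the cached state of every
// sub-surface below it that behaves as synchronized.
void applyCachedChildren(Surface& s) {
    for (Surface* child : s.children) {
        if (!effectivelySynchronized(*child)) continue;
        if (child->hasCached) {
            child->onScreen = child->cached;
            child->hasCached = false;
        }
        applyCachedChildren(*child);
    }
}

// Model of wl_surface.commit: cached when the surface behaves as
// synchronized, applied straight away otherwise.
void commit(Surface& s, const std::string& content) {
    if (effectivelySynchronized(s)) {
        s.cached = content;
        s.hasCached = true;
        return;
    }
    s.onScreen = content;
    applyCachedChildren(s);
}
```

In this model, with A (top-level), B (synchronized) and C (desynchronized), a commit on C leaves `C.onScreen` unchanged until A is committed. When the video sub-surface is instead a desynchronized child directly of the top-level surface, its commits are applied immediately.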

>
>>>   
>>>> As the software components are separately developed systems it is
>>>> difficult to sync between them without changing them, but may be possible.
>>> Yes, Wayland does many things differently than older toolkits
>>> expected.
>>>
>>>
>>> ...
>>>   
>>>>> Is Gst waylandsink API the kind that it internally creates a new
>>>>> wl_surface for itself and makes it a sub-surface of the given surface,
>>>>> or is there an option to tell Gst to just push frames into a given
>>>>> wl_surface?
>>>>>
>>>>> If the former, then waylandsink is supposed to somehow give you an API
>>>>> to set the sub-surface position and z-order wrt. its parent and
>>>>> siblings. If the latter, you would create wl_subsurface yourself and
>>>>> keep control of it to set the sub-surface position and z-order.
>>>>>
>>>>> Either way, the optimal result is one top-level wl_surface, with one
>>>>> sub wl_surface drawn by Gst, and no surfaces in between in the
>>>>> hierarchy.
>>>> Yes, the Gst waylandsink API creates a new subsurface for itself from
>>>> the GUI's managed surface to separate itself from the GUI (Qt/Gnomes)
>>>> surfaces. It doesn't allow you to provide a surface to directly use. I
>>>> don't think it allows the surface to be moved/resized although it can
>>>> display video at an offset and size as far as I know (although it may
>>>> actually change the surface to do this I will have a look). It doesn't
>>>> allow the z-order to be changed I think. It expects the GUI to change
>>>> its surface and I guess assumes its subsurface would effectively move in
>>>> z and xy position due to the GUI moving/raising/lowering its surface
>>>> (the parent) in a similar manner to how X11 would have done this.
>>> Sounds like gst waylandsink is lacking z-ordering API.
>>>
>>> Wayland sub-surfaces are very different from X11 windows. One
>>> fundamental difference is that sub-surfaces can extend beyond their
>>> parent's area. Another is that sub-surfaces always have their own
>>> storage (because you have to explicitly attach wl_buffers to them),
>>> they cannot address the parent's storage like in X11. And more.
>>>
>>> X11 windows were perhaps meant for individual widgets like buttons to
>>> optimise drawing and input handling. Wayland sub-surfaces are meant for
>>> things that need a separate wl_buffer in order to be off-loaded to DRM
>>> KMS hardware for direct scanout. It's like the opposite ends of the
>>> granularity spectrum of off-loading things to the display server.
>> Yes, as far as I know X11 Windows were for individual widgets as well as
>> overall application windows. When I started programming in X11, in the
>> later 80's/early 90's there was the X11 Intrinsics toolkit that did just
>> that. It nicely separated the Widgets drawing from one another
>> modularising this all down to the protocol and display server level.
> Right, and that is the polar opposite of Wayland. Wayland was invented
> at a time when application toolkits were basically drawing complete
> pixel images of whole windows client-side and sending that image to the
> display server. Hence, the deliberate design decision to not push any
> client side architectural details into the display server, as there was
> simply no need.
>
> That modularising is supposed to happen inside a client toolkit.
>
> Routing internal things through the display server and back is just
> extra overhead and latency. One can communicate with in-process
> libraries much more efficiently than that.

I don't think this is really a client side architectural detail;
this seems more like a Wayland API/server limitation.

>
>> But
>> it was inefficient especially when more 3D looking screen objects were
>> wanted (moving to Motif) and so GUI toolkits started using DRM to draw
>> to the one Window. Mind you current GUI's have gone back to the plain
>> and simple early days look again!
>>
>> The concept of having a generic Window/Surface that can be in a tree
>> hierarchy is still useful though where you want to modularise software
>> and/or have separate distinct pieces of software displaying into an
>> application's GUI. It's a shame Wayland's current surface system doesn't
>> work well as a tree hierarchy for such things.
> Right. Wayland was never meant to do that. I've seen people saying bad
> things about XEmbed for instance.
>
> Wayland sub-surfaces were an answer to a very specific problem: how to
> leverage display hardware planes. Hardware planes are a much more
> efficient way of compositing (parts of) windows than CPU or GPU
> composition, but they are also much more scarce and rigid.
> Hardware planes are especially useful for videos.

Yes, and that is how I and other previous systems use this. GStreamer is
using 2D/3D hardware and doing all of the hardware processing to a
surface efficiently. Unfortunately some toolkits end up having to
break this and read the Video stream in to process it using lots of
memcpy and CPU resources, possibly due to this Wayland limitation. These
toolkits are then depending on the GStreamer API in detail, causing
dreaded version incompatibility trees in a system as it develops.

>
>>>> I will try the middle desync and/or this method by managing the
>>>> waylandsink surface outside of waylandsink if I can and if it doesn't
>>>> mess up either Qt's or waylandsink's operations.
>>>>
>>>> Thanks for the input.
>>> Thanks,
>>> pq
>> I believe I have managed to work around this issue without having to
>> change Qt or Waylandsink API's and code, although I have only tested
>> under Fedora and not the actual embedded platform it needs to run on.
>>
>> I couldn't set the QWidget's subsurface to desynced as I cannot get its
>> subsurface as far as I can see. Qt provides a method to get a QWidget's
>> wl_surface, but not its wl_subsurface as far as I can see with a brief
>> look. It's all hidden away (unless I change Qt code) and I couldn't see a
>> way of doing this from Wayland. Maybe my discussions in the Qt Jira
>> might lead to a method in the future.
>>
>> I could probably modify Waylandsink to provide an API to manage its
>> subsurface, but ideally I don't want to modify upstream code unless
>> really needed. Maybe in the long term this is the way to go although
>> fixing at the Qt level sounds better if possible (it would probably need
>> Qt and waylandsink mods).
> I do wonder if gst waylandsink could use API improvements upstream.

Maybe this would help, but also Wayland could do with API improvements :)

>
>> What I have done for now, barring a quick cleaner/better method, is to do
>> work at the Wayland level in my test application. Here for my Video
>> subwidget, I create a new surface/subsurface from the Qt toplevel
>> surface, set it to desynced mode and pass that to waylandsink. As I now
>> have access to the wl_subsurface that waylandsink is using as its parent, I
>> can raise and lower it, position it and resize it giving me some degree
>> of control. I have had to go through all the Wayland wl_registry work to
>> get compositor, subcompositor API's to do this (As Qt does not provide
>> access to all of these). I am unsure if this method will provide the
>> ability for my video to sit behind transparent background QWidgets, but
>> I can work without that ability for now in the system I am developing, I
>> just hope I don't see other issues with this approach.
> I'm glad you found something.

Yes, but it's a big bodge. It would be nice to do this in a clean, simple
and standard way.
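
On the raise/lower part of the workaround: the wl_subsurface place_above and place_below requests re-stack a sub-surface relative to a sibling or to the parent surface. The sketch below models that re-stacking on a plain list, ordered bottom to top. It is an illustration only: `Stack`, `placeAbove` and `placeBelow` are hypothetical helpers and the surface names are made up. In the real protocol the new order is, as far as I know, only applied when the parent surface's state is next applied, which this model leaves out.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Surfaces of one window, ordered bottom to top (hypothetical model).
using Stack = std::vector<std::string>;

static bool contains(const Stack& stack, const std::string& name) {
    return std::find(stack.begin(), stack.end(), name) != stack.end();
}

// Take `surface` out of the stack and put it back just above `reference`.
// Returns false, leaving the stack unchanged, if either name is unknown.
bool placeAbove(Stack& stack, std::string surface, std::string reference) {
    if (surface == reference || !contains(stack, surface) ||
        !contains(stack, reference)) {
        return false;
    }
    stack.erase(std::find(stack.begin(), stack.end(), surface));
    auto ref = std::find(stack.begin(), stack.end(), reference);
    stack.insert(ref + 1, surface);
    return true;
}

// Take `surface` out of the stack and put it back just below `reference`.
bool placeBelow(Stack& stack, std::string surface, std::string reference) {
    if (surface == reference || !contains(stack, surface) ||
        !contains(stack, reference)) {
        return false;
    }
    stack.erase(std::find(stack.begin(), stack.end(), surface));
    auto ref = std::find(stack.begin(), stack.end(), reference);
    stack.insert(ref, surface);
    return true;
}
```

Starting from the stack toplevel, widgets, video, calling `placeBelow(stack, "video", "toplevel")` yields video, toplevel, widgets. Placing the video sub-surface below its parent like this might be one way to let it show through transparent areas of the GUI, though I have not verified that with Qt.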

>
>> Wayland/Qt/Gstreamer has a knack of getting in your way! I might have to
>> write/modify an alternative Weston compositor to get around this Wayland
>> feature/flaw, and support simple top level surface moves etc. (which are
>> also causing problems, I already have had to add a different shell to
>> allow the application to move its windows to a separate HDMI screen)
>> although obviously that is a hack as well but barring a proper way to do
>> this at least it should work.
> Yes, Wayland window management development is very much driven by
> generic desktops, which have some conflicting goals with many embedded
> designs. On desktops, apps simply do not have all the information to
> position their windows while the compositor knows better. Outside of
> desktops, the system designer can give the chosen apps the desired
> behaviour before the system is deployed.
>
> Weston has kiosk-shell, which allows you to configure which app window
> should go on which output, IIRC. The basic assumption in kiosk-shell is
> that any active app is fullscreen.
>
> For other Weston uses, there is a rough idea:
> https://gitlab.freedesktop.org/wayland/weston/-/issues/520
>
> Someone might even look into that this year.

Yes, I have produced my own special shell based on the kiosk shell but 
the kiosk shell is pretty limited and actually has bugs on the NXP 
platform (no background surface drawing).

Personally I think this side of Wayland is its worst area. I think it's a
shame it didn't look at X11 and take at least the good ideas from that.
In particular I believe there should be one standard fully featured 
Wayland server that can be used on all systems with all desktop/embedded 
systems (there can also be alternatives). This would have an external 
Window manager API to manage the Windows so that each desktop/embedded 
system could then do what it wanted or an embedded application could use 
this directly. Having just the one server would allow faster Wayland bug 
fixing, better stability and less re-inventing the wheel.

I wish I had time to develop this, even if just for our own usage.

>
>
> Thanks,
> pq

-- 
Dr Terry Barnaby            BEAM Ltd
Phone: +44 1454 324512      Northavon Business Center,
Email: terry at beam.ltd.uk    Dean Rd, Yate
Web: www.beam.ltd.uk        Bristol, BS37 5NH, UK
BEAM Engineering: Instrumentation, Electronics/Software/Systems