Screen shooting and recording protocols (Re: Authorized clients)

Fri Jan 10 08:10:36 PST 2014

On Fri, 10 Jan 2014 15:26:19 +0100
Maarten Baert <maarten-baert at hotmail.com> wrote:

> On 10/01/14 09:54, Pekka Paalanen wrote:
> > I think it is realistic on good platforms, and also essential for
> > performance. Needs to be tried out, of course.
> X11 does capturing by simply copying the buffer to a SHM image. This
> takes about 2ms for a 1920x1080 frame. That's perfectly usable,
> considering that any video encoder will need far more time to encode
> that same frame. So it's nice to have zero overhead capturing, but not
> essential IMO.
> 
> > What would you capture, when the window is already rotated? The
> > aligned bounding box perhaps?
> >
> > Clients do not have a coordinate transformation, that could relate a
> > window to the global desktop or output coordinate space. Clients
> > simply do not know about those coordinate spaces, and the mapping
> > between a window and them may not be linear even in homogeneous
> > coordinates.
> If a window is rotated, then the user will probably expect a
> screenshot of a rotated window, not one of the original un-rotated
> window. So yes, I think the axis-aligned bounding box is the best
> choice in that case.
> 
> > There is a slight problem with "as it appears on screen". One window
> > may appear multiple times in different forms on screens (the weston
> > views mechanism). Exposé effect temporarily scales the window into a
> > size that is not readable, do you want exposé to affect the capture?
> > There can be live preview views of a window that are not useful to
> > record.
> So it would be up to the compositor to decide what rectangle is the
> right one.
> 
> > If the window is rotated, do you want it also captured rotated?
> Yes. Why on earth would the user decide to rotate a window if he
> doesn't want it to be captured like that?

Well, if the user wants to capture it like that, maybe she would rather
take a shot of the whole desktop?

Maybe I'm just weird, but when I think about capturing a single window,
I really think about the window contents, not how it is presented on
screen. If I want how the thing is presented on screen, I capture the
screen and maybe crop afterwards.

I find it annoying that I have to raise a window to take a shot of it
without other windows obscuring it.

> > Another thing is that if the window is partially or completely
> > obscured by other windows. If you are recording that window, should
> > you be getting the window content, or what happens to be visible at
> > its position on screen?
> Depends on the goal. It would be nice to have both features available.

What you call capturing a window is, to me, just capturing a sub-rect
of the screen. To me, capturing a window is a different thing: it gets
the original window contents.

> > What if the window is on a virtual desktop that is
> > not visible at all?
> Then how would you pick the window in the first place?

It was visible when the user started the capture, then switched virtual
desktops to do something else while the window was being recorded, e.g.
some automatic long GUI test run. Again, that goes to my interpretation
of what window capture does.

> > In your proposition, how do you define what is part of the window or
> > not? It needs to relate to the protocol objects somehow.
> I assumed it would be impossible (or at least very impractical) to
> capture tooltips and menus as part of a window. I mean, what if the
> window is transformed and the tooltips are not? I wanted to avoid the
> issue entirely by simply recording the screen as-is, by choosing a
> rectangle on the screen based on x/y/w/h. Do you think it is realistic
> to require that all compositors implement the complex logic to capture
> and follow a single window, without transformations, but with support
> for tooltips and menus? I fear that most compositors won't implement
> it at all, and if they do, it will likely be buggy because it hasn't
> seen enough testing.

It could be optional. But sure, it will be very complicated, and that
is what "window capturing" means to me in the broadest sense. Maybe we
should forget window capturing.

> > Menus, tooltips, drop-downs etc. are a good question, since they can
> > extrude outside of the original window on any side like left or top
> > - how should that affect the capture? Would it appear as if the main
> > window temporarily jumped to the right/down when you record a video?
> Tooltips and menus that are not fully inside the recorded rectangle
> would be partially unreadable, that's correct. There's not much that
> can be done about that, right? You can't change the size of a video
> once it has started, encoders don't allow that (and video players
> don't support it).

Yeah... but supporting resizing is a must for window recording, since
you can resize the window. One more nail in the coffin of window
recording.

> > It seems capturing "a window" requires close cooperation with the
> > shell (plugin), and probably a non-trivial amount of metadata to
> > create a pleasing video. I don't have a good proposition on how it
> > should work.
> >
> > Maybe the "just choose an aligned rect" way is really the easiest
> > while being sufficient for all real use cases.
> That's exactly why I suggested it. It's simple to implement and good
> enough.

Right, I got confused by terminology here.

Choosing the rectangle might be a bit complicated, unless you limit
yourself to axis-aligned rectangles on the scanout buffer. If your
display device is a
fish-eye projector or something more wacky, the image you get might not
be straight nor have a reasonable representation on a plane. Of course,
shooting the "whole screen" will be an interesting problem, too.

Ok ok, let's come back to the real world. ;-)

> > Oh, OTOH color management... zero-copy output capture would of
> > course capture the framebuffer drawn for the output color space, and
> > additionally that could be e.g. 10 bits per channel.
> Is it feasible to temporarily 'downgrade' to whatever capture format
> the screenshooter/screen recorder requests? That will probably be
> BGRA (8-bit).

That probably requires a video mode switch, which is totally possible.
The video mode is under the complete control of the compositor.

> For the color space I just assume sRGB on X11, and AFAIK image viewers
> and video players do the same, so no-one complains that their
> screenshots/videos look wrong. Color management in video is a
> nightmare anyway, there are a few (poorly defined) versions of YUV
> that you can use, but you can't embed an ICC profile or anything like
> that AFAIK :(.
> 
> Do 10-bit displays actually exist? I thought even those 8-bit displays
> are really only 6-bit plus dithering.

There has been talk about 10-bit output, so I guess it would be useful
to someone. And support for it is coming if not there already AFAIK.

IOW, you would be willing to downgrade the image quality on the monitor
while recording is on. That is perfectly doable, if it is wanted.
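
For comparison, the alternative to a mode switch would be converting
each captured pixel in software. Here is a rough Python sketch of what
that loses, assuming the 10-bit framebuffer is packed like
DRM_FORMAT_XRGB2101010 (2 unused bits, then 10 bits each of R, G, B in
one 32-bit word) and that simply dropping the two low bits per channel
is acceptable. The function name is hypothetical, and a real
implementation might dither instead of truncating.

```python
def xrgb2101010_to_bgra8888(pixel):
    """Convert one packed 32-bit XRGB2101010 pixel to 4 bytes in
    B, G, R, A order, truncating each channel from 10 to 8 bits."""
    r10 = (pixel >> 20) & 0x3FF
    g10 = (pixel >> 10) & 0x3FF
    b10 = pixel & 0x3FF
    # Dropping the two least significant bits: 1024 levels become 256.
    r8, g8, b8 = r10 >> 2, g10 >> 2, b10 >> 2
    return bytes((b8, g8, r8, 0xFF))  # alpha forced to opaque


if __name__ == "__main__":
    full_red_half_green = (1023 << 20) | (512 << 10)
    print(list(xrgb2101010_to_bgra8888(full_red_half_green)))
```

This costs a pass over every frame, which is what zero-copy capture was
supposed to avoid in the first place.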

Thanks,
pq