Screen shooting and recording protocols (Re: Authorized clients)

Martin Peres martin.peres at free.fr
Wed Jan 22 10:35:23 PST 2014


On 09/01/2014 22:14, Martin Peres wrote:
> On 09/01/2014 21:47, Maarten Baert wrote:
>> On 09/01/14 10:00, Pekka Paalanen wrote:
>>> Those are some reasons why screen recording (video) is easier to do as
>>> a compositor plugin, like it is currently in Weston. A separate client
>>> would need a non-trivial amount of new Wayland protocol to work well.
>> That's probably true, but you can't expect applications to write a
>> separate plugin for each compositor. Besides, I highly doubt that it's a
>> good idea to load ffmpeg/libav and all its dependencies into the
>> compositor - these libraries aren't exactly known for their stability ;).
>
> Very valid point!
>>
>>> Instead, a client could ask the compositor to ask the user which window
>>> she wants to capture, and then the compositor would capture only that.
>>> Capturing individual outputs is a lot easier: Wayland core protocol
>>> already exposes all outputs, so the client can directly ask for a
>>> certain output.
>> The window picking function that I have now (for X11) is really just a
>> way to quickly enter the correct coordinates and size of the area that
>> should be recorded. I don't expect the user to move the window around.
>> And just to be clear, the goal is NOT to capture only the buffer of a
>> single window, because then 'subwindows' (like browser plugins) and
>> dialog windows won't be recorded. If I really wanted to capture just a
>> single SHM buffer, I would probably just do it client-side, in the same
>> way I already do OpenGL recording now (because this gives me much more
>> flexibility).
>>
>> So what I'm asking for is just a function to get the rectangle (x,y,w,h)
>> that corresponds to the window directly below a given position (x,y).
>> The compositor doesn't even have to handle the complexity of 'real' user
>> interaction (i.e. showing a message to the user telling him to pick a
>> window, waiting for the user to do that, dealing with clients that make
>> a request and then die, ...). Such a function would do everything I
>> need, and I think it also covers what the existing screenshot
>> applications need. I prefer to do it like this because it is the most
>> simple way to implement this for the compositor, and it is more flexible
>> (e.g. applications can choose to select the recording area in advance
>> and then repeatedly use the same area without telling the user to select
>> it over and over again).
>
> I'm not saying supporting the acquisition of just a rectangle isn't a
> good idea, but if what the user wants is the output of a window, it is
> stupid to grab the whole screen. Shouldn't we try to make stuff just work,
> whatever the user does?
>>
>>> In the part I cut out, there were some concerns about who and how
>>> should decide what hotkeys may or may not start shooting or recording.
>>> I think there was recently'ish a nice suggestion about a generic
>>> solution to the global hotkey problem, so you can probably leave it for
>>> that. I'm referring to the "clients bind to actions, server decides
>>> what hotkey is the action" kind of an idea, IIRC. There is a myriad of
>>> details to solve there, too.
>> That would make a lot more sense, at least it is a lot more flexible
>> than requiring the recording application to be launched by the same key
>> press that starts the recording (which would effectively force me to
>> split my application into two separate processes, and then I would have
>> to figure out a secure way to let these two processes communicate).
>>
>> But what about things like mouse clicks? Can the compositor tell that
>> the user clicked the 'start recording' button?
>
> It can't, but we are talking about video recording. The app should just
> drop the frames when it is not interested in them. Once it has been
> run with the right hotkey, it will receive all the frames.
>
> As for security, the compositor should provide some feedback in the
> notification tray until the app stops, I guess.

This is exactly what I want to avoid:
https://www.youtube.com/watch?v=s5D578JmHdU
Chrome neither properly revokes the right to access the mic nor
displays its internal state properly.

Static access control is not sufficient for that (the website did get the
permission from the user). Chromium should let the user express the
intent that all sound recording stop at that point. Privilege revocation
is not handled properly, and apparently neither is showing the user the
current state of the system.
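
To make the two missing pieces concrete (revocation the user can trigger,
and a view of the current state), here is a rough sketch in Python. It is
purely illustrative: none of these names exist in Wayland, Weston or
Chromium, and GrantRegistry, revoke_all, may_deliver and indicator are all
made up for the example. The assumption is that whoever hands out capture
data (compositor or browser) checks the grant again before every delivery,
instead of checking only once when the permission is given.

```python
from dataclasses import dataclass, field
from enum import Enum


class GrantState(Enum):
    ACTIVE = "active"
    REVOKED = "revoked"


@dataclass
class CaptureGrant:
    client: str      # hypothetical client identifier
    resource: str    # e.g. "screen" or "microphone"
    state: GrantState = GrantState.ACTIVE


@dataclass
class GrantRegistry:
    grants: list = field(default_factory=list)

    def grant(self, client, resource):
        """Record that the user allowed `client` to capture `resource`."""
        g = CaptureGrant(client, resource)
        self.grants.append(g)
        return g

    def revoke_all(self, resource):
        """User-triggered: stop every active capture of `resource`."""
        revoked = 0
        for g in self.grants:
            if g.resource == resource and g.state is GrantState.ACTIVE:
                g.state = GrantState.REVOKED
                revoked += 1
        return revoked

    def may_deliver(self, g):
        """Checked before each frame or sample is handed to the client."""
        return g.state is GrantState.ACTIVE

    def indicator(self):
        """What a tray indicator would list: who captures what right now."""
        return sorted(
            (g.client, g.resource)
            for g in self.grants
            if g.state is GrantState.ACTIVE
        )


registry = GrantRegistry()
rec = registry.grant("recorder", "screen")
mic = registry.grant("browser", "microphone")
registry.revoke_all("microphone")   # the user says "stop all sound recording"
print(registry.indicator())
```

The point of the sketch is only that the grant is dynamic state owned by
the trusted side: the user can flip it at any time, and the indicator is
derived from that same state rather than from what the client reports.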

I should really take the time to summarize the options we discussed,
their associated risks, and what should be done to mitigate them.

Martin

