[PATCH] protocol: Add DnD actions

Mon Apr 20 23:29:42 PDT 2015

On Sat, Apr 18, 2015 at 04:53:46PM +0200, Carlos Garnacho wrote:
> Hey Jonas,

Hi,

Thanks for the explanations. I'll reply inline.

> 
> On vie, 2015-04-17 at 15:50 +0800, Jonas Ådahl wrote:
> 
> <snip>
> > 
> > > For the touch case, depending on how the grab is implemented, with the
> > > current guidelines the only 2 choices are "leave the client in
> > > inconsistent state" or "make the client still receive ongoing touches
> > > despite the pointer grab" (same applies if the grab is touch
> > > triggered, only with the other touches that didn't trigger the grab).
> > > 
> > > More on topic, keyboards are also funky if we keep focus on 
> > > clients,
> > > you can conceivably Esc/Ctrl-Q/... to close the app you're dragging
> > > from. IMO the way forward is precisely this, the compositor 
> > > becomes in
> > > control of the keyboard, and we offer the missing semantics to 
> > > cover
> > > for this.
> > 
> > Meaning it's the compositor that decides whether a drag is a copy or a
> > move? I.e. either we hard code "Ctrl" to be copy in the protocol, or DND
> > will behave differently on each compositor. Not sure I like either of
> > those options.
> 
> Yes, this would be implementation-dependent in the compositor as my 
> proposal goes. We have 3 players here, whoever gets to handle the 
> modifier->action translation, there's room for confusion in cross-
> DE/toolkit cases.

The problem as I see it is that it would be confusing even for DND within
the same application, since how it works would depend on the DE.

> 
> Focusing on actions, I see the following possible data flows here 
> (depicting the same situation on all: initial negotiation, changes on 
> the dest on say pointer motion, and a modifier change):
> 
> 1. If handled purely by the source:
> 
> wl_data_source      compositor            wl_data_offer
> ==============      ==========            =============
>                  -> notify_actions     <-
> dest_actions     <-
>                  -> preferred_action
> action           <-                    -> action
> 
>                         ...
>                                           (pointer moves across widgets)
>                     notify_actions     <-
> dest_actions     <-
>                  -> preferred_action
> action           <-                    -> action
> 
>                         ...
>                     (modifiers change)
> modifiers        <-
>                  -> preferred_action
> action           <-                    -> action
> 
> 
> 2. If handled purely by the dest:
> 
> wl_data_source      compositor            wl_data_offer
> ==============      ==========            =============
>                  -> notify_actions
>                                        -> source_actions
>                     notify_actions     <-
>                     preferred_action   <-
> action           <-                    -> action
> 
>                         ...
>                                           (pointer moves across widgets)
>                     notify_actions     <-
>                     preferred_action   <-
> action           <-                    -> action
> 
>                         ...
>                     (modifiers change)
>                                        -> modifiers
>                     preferred_action   <-
> action           <-                    -> action
> 
> 
> 3. If handled purely by the compositor:
> 
> wl_data_source      compositor            wl_data_offer
> ==============      ==========            =============
>                  -> notify_actions     <-
> action           <-                    -> action
> 
>                         ...
>                                           (pointer moves across widgets)
>                     notify_actions     <-
> action           <-                    -> action
> 
>                         ...
>                     (modifiers change)
> action           <-                    -> action
> 
> 
> Options #1 and #2 involve roundtrips, option #3 doesn't. Options #1 
> and #2 would still need some validation on the compositor to avoid 
> picking options unknown to either side.

I think it's wrong to refer to these as round trips. A round trip is
typically a client that needs to wait for a reply from a server, but here
in any of the three options no one is waiting for anything, thus we have
no round trips at all. The main differences as I see it are:

In options 1 and 2 we pass an additional modifier state, and make either
side responsible for choosing. In option 3 we move this and make the
compositor choose (either with hard coded policy in the protocol or some
arbitrary policy given some private state inside the compositor).
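
Whichever client chooses in options 1 and 2, the compositor would still
have to check that the choice is known to both sides, as you point out.
A minimal sketch of that check, assuming actions are a bitmask; the
Action values and the validate_preferred name are made up for
illustration and are not part of any protocol draft:

```python
from enum import IntFlag


class Action(IntFlag):
    # Hypothetical action bits, only for illustration.
    NONE = 0
    COPY = 1
    MOVE = 2
    ASK = 4


def validate_preferred(source_actions, dest_actions, preferred):
    """Return `preferred` if it is one action both sides support, else NONE."""
    common = source_actions & dest_actions
    is_single = preferred in (Action.COPY, Action.MOVE, Action.ASK)
    if is_single and preferred & common:
        return preferred
    # Unknown to one side, or not a single action: treat as "no action".
    return Action.NONE
```

With this, a preferred "move" passes only when both the source and the
destination advertised it; anything else collapses to NONE.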

In options 1 and 2, we have a slightly longer delay in visual feedback
regarding the action (caused by the outsourcing of the decision making).
I'd say these delays are in most cases insignificantly small. In any
solution we end up with latency as we are dealing with 3 entities
communicating asynchronously. Note that option 3 has this delay as well,
but for the visual feedback of the chosen mime type.

In option 2, "actions" would be handled identically to mime types. The
modifier change would be semantically equivalent to a motion event. We'd
have all the decision making (mime type, action) on one side. I suppose
this should be considered most "consistent" with the existing DND
protocol if that matters in any way.

In option 1, we'd be slightly closer to how XDND works, but we'd split
the decision making between the two end points, which might not be very
nice. We'd also have one extra (probably also insignificant) delay since
the compositor needs to let the source decide the action before
performing the drop on the destination. We'd also need to delay the drop
so the source can make a final decision, which doesn't seem very nice.

Assuming the extra visual feedback delay can be considered insignificant
I think we have 3 major paths to take:

1) Actions are chosen arbitrarily by the compositor
2) Policy is hard coded into the protocol (Ctrl means copy etc)
3) Pass additional state (modifier) to the decision making client

Personally, by just looking at how the protocol would look and how data
would flow, option 3 with the destination making the choice seems to
make the most sense to me, since it'd be most consistent with how it
currently works and it doesn't split up the decision making nor add
policy or undefined behavior to the protocol.
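
To make that concrete, here is a rough sketch of what the choice could
look like on the destination side once it is handed the modifier state.
Everything in it is an assumption for illustration: the Action bits, the
string modifier names, and the "Ctrl means copy, Shift means move"
mapping, which here would be toolkit policy rather than something the
protocol defines:

```python
from enum import IntFlag


class Action(IntFlag):
    # Hypothetical action bits, only for illustration.
    NONE = 0
    COPY = 1
    MOVE = 2


def destination_pick(source_actions, widget_actions, modifiers):
    """Pick one action from what the source offers and the widget accepts."""
    common = source_actions & widget_actions
    # Toolkit-side policy (an assumption): modifiers override the default.
    if "ctrl" in modifiers and Action.COPY & common:
        return Action.COPY
    if "shift" in modifiers and Action.MOVE & common:
        return Action.MOVE
    # No usable modifier: fall back to a fixed preference order.
    for action in (Action.MOVE, Action.COPY):
        if action & common:
            return action
    return Action.NONE
```

The destination would rerun this on enter, motion and modifier changes,
the same points at which it already reconsiders the mime type.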

> 
> So option #3 seems neater, although in practice drag destinations are 
> non-uniform, some action negotiation similar to #2 is needed, so my 
> proposal goes:
> 
> wl_data_source      compositor            wl_data_offer
> ==============      ==========            =============
>                  -> notify_actions
>                                        -> source_actions
>                     notify_actions     <-
> action           <-                    -> action
> 
>                         ...
>                                           (pointer moves across widgets)
>                     notify_actions     <-
> action           <-                    -> action
> 
>                         ...
>                     (modifiers change)
> action           <-                    -> action
> 

In order to get a better picture of the complete flow (with mime type
negotiation included), given my understanding of it, I wrote down
something similar to your flow graphs, but with all the events and
requests included. I changed the naming to make it clearer when an
event/request extends some other event/request, as well as to be more
consistent with the existing protocol. Again, the option where the
destination makes the decision seems most in line with how the rest of
the protocol works, which is illustrated in the message flow.

Current flow:

Source                  compositor            Destination
==============          ==========            =============

-> new wl_data_source
-> wl_data_source.offer
-> wl_data_source.offer
-> wl_data_device.start_drag

                 ** Enters destination surface **
                     -> new wl_data_offer
                     -> wl_data_offer.offer
                     -> wl_data_offer.offer
                     -> wl_data_device.enter
                                              <- wl_data_offer.accept
                     <- wl_data_source.target
 ** Update icon **

                 ** Move mouse **
                     -> wl_data_device.motion
                                              <- wl_data_offer.accept
                     <- wl_data_source.target
 ** Update icon **

                 ** Release button **
                     -> wl_data_device.drop
                                              <- wl_data_offer.receive
                     <- wl_data_source.send

Source decides action flow (destination gets no feedback(*)):

(*) It won't be possible to send a valid action until receiving the
first wl_data_source.target event. Therefore I skipped the possibility
for visual feedback regarding chosen action on the destination side in
this example. The same issue exists in the other options, but there it
is not communicated over the wire.

Source                  compositor            Destination
==============          ==========            =============

-> new wl_data_source
-> wl_data_source.offer
-> wl_data_source.offer
-> wl_data_device.start_drag

                 ** Enters destination surface **
                     -> new wl_data_offer
                     -> wl_data_offer.offer
                     -> wl_data_offer.offer
                     -> wl_data_device.enter
                                              <- wl_data_offer.action
                                              <- wl_data_offer.action
                                              <- wl_data_offer.accept
                     <- wl_data_source.target_action
                     <- wl_data_source.target_action
                     <- wl_data_source.target
 ** Update icon **

                 ** Move mouse **
                     -> wl_data_device.motion
                                              <- wl_data_offer.action
                                              <- wl_data_offer.action
                                              <- wl_data_offer.accept
                     <- wl_data_source.target_action
                     <- wl_data_source.target_action
                     <- wl_data_source.target
 ** Update icon **

                 ** Press modifier **
                     <- wl_data_source.modifier
 ** Update icon **

                 ** Release button **
                     <- wl_data_source.drop

-> wl_data_source.drop_with_action
                     -> wl_data_offer.action
                     -> wl_data_device.drop
                                              <- wl_data_offer.receive
                     <- wl_data_source.send

Destination decides (all sides get feedback):

Source                  compositor            Destination
==============          ==========            =============

-> new wl_data_source
-> wl_data_source.offer
-> wl_data_source.offer
-> wl_data_source.action
-> wl_data_source.action
-> wl_data_device.start_drag

                 ** Enters destination surface **
                     -> new wl_data_offer
                     -> wl_data_offer.offer
                     -> wl_data_offer.offer
                     -> wl_data_offer.action
                     -> wl_data_offer.action
                     -> wl_data_device.enter
                                              <- wl_data_offer.accept_action
                                              <- wl_data_offer.accept
                     <- wl_data_source.target_action
                     <- wl_data_source.target
 ** Update icon **

                 ** Move mouse **
                     -> wl_data_device.motion
                                              <- wl_data_offer.accept_action
                                              <- wl_data_offer.accept
                     <- wl_data_source.target_action
                     <- wl_data_source.target
 ** Update icon **

                 ** Press modifier **
                     -> wl_data_device.modifier
                                              <- wl_data_offer.accept_action
                                              <- wl_data_offer.accept
                     <- wl_data_source.target_action
                     <- wl_data_source.target
 ** Update icon **

                 ** Release button **
                     -> wl_data_device.drop
                                              <- wl_data_offer.accept_action
                                              <- wl_data_offer.receive
                     <- wl_data_source.target_action
                     <- wl_data_source.send
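
In the flow above, accept_action always comes right before accept (or
receive). One way to read that, sketched below for the accept case, is
that the compositor latches the action as pending state and only applies
it when the accept that follows arrives. The DataOfferState class and
its fields are hypothetical bookkeeping, not proposed protocol:

```python
class DataOfferState:
    """Hypothetical compositor-side bookkeeping for one wl_data_offer."""

    def __init__(self):
        self.pending_action = None  # set by accept_action, not yet applied
        self.accepted = None        # (mime_type, action) once accept arrives

    def accept_action(self, action):
        # Only latch the value; nothing is forwarded to the source yet.
        self.pending_action = action

    def accept(self, mime_type):
        # Apply mime type and latched action together, then clear the latch.
        self.accepted = (mime_type, self.pending_action)
        self.pending_action = None
```

That way the source never sees a new action paired with a stale mime
type, since both are applied together.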

Compositor decides (everyone gets feedback):

Source                  compositor            Destination
==============          ==========            =============

-> new wl_data_source
-> wl_data_source.offer
-> wl_data_source.offer
-> wl_data_source.action
-> wl_data_source.action
-> wl_data_device.start_drag

                 ** Enters destination surface **
                     -> new wl_data_offer
                     -> wl_data_offer.offer
                     -> wl_data_offer.offer
                     -> wl_data_offer.action
                     -> wl_data_offer.action
                     -> wl_data_device.enter
                                              <- wl_data_offer.action
                                              <- wl_data_offer.action
                                              <- wl_data_offer.accept
                     -> wl_data_offer.target_action
                     <- wl_data_source.target_action
                     <- wl_data_source.target
 ** Update icon **

                 ** Move mouse **
                     -> wl_data_device.motion
                                              <- wl_data_offer.action
                                              <- wl_data_offer.action
                                              <- wl_data_offer.accept
                     -> wl_data_offer.target_action
                     <- wl_data_source.target_action
                     <- wl_data_source.target
 ** Update icon **

                 ** Press modifier **
                     -> wl_data_offer.target_action
                     <- wl_data_source.target_action
 ** Update icon **

                 ** Release button **
                     -> wl_data_device.drop
                                              <- wl_data_offer.receive
                     <- wl_data_source.send

> 
> > 
> > > 
> > > > 
> > > > I also don't see the variable state to be a good thing 
> > > > considering we'd
> > > > have three independent states, meaning it'd get a very racy and
> > > > non-deterministic protocol.
> > > 
> > > Ideally all of this would have been right from the start as 
> > > parameters
> > > to wl_data_device.start_drag, wl_data_device.enter and
> > > wl_data_offer.accept. Despite the extra combinations in data flow, 
> > > I
> > > fail to see how this gets racy or non-deterministic, you surely 
> > > will
> > > get a supported action and mimetype on the drag dest, or the drag 
> > > will
> > > be cancelled. If it is more comforting, we can make it more explicit
> > > that wl_data_offer.notify_actions is the central point where DnD
> > > success/action is decided, and that wl_data_offer.accept/receive 
> > > are a
> > > second step after it.
> > 
> > We can extend a request by "prefixing" it with another request that
> > depends on the final request (start_drag for example) to take effect.
> > We can do the same for events, i.e. we first send an event that only
> > takes effect when another event is sent. This is for example how we 
> > are
> > extending wl_pointer.axis with axis source information.
> 
> Ok, if that's the effective way to add additional arguments, then I 
> don't see at all how you consider this racy :).

No, that was more me not understanding the way you intended it to work.
Sorry about that.

Jonas

> 
> > 
> > This way we can effectively add new parameters to start_drag or 
> > accept,
> > just that in the protocol we make them separate requests/events.
> > 
> > > 
> > > > 
> > > > If we'd want to have the destination choose the action, the 
> > > > source
> > > > should advertise its possible actions, forwarded by the 
> > > > compositor to
> > > > the destination ("atomically", without intermediate committed 
> > > > state).
> > > 
> > > wl_data_offer.source_actions?
> > > 
> > > > 
> > > > If we want to enable one of the clients to rely on keyboard 
> > > > modifier
> > > > state, I think this should be communicated to the deciding end 
> > > > point;
> > > 
> > > Which used to be the drag source in XDND, as the holder of
> > > pointer/keyboard grabs. It would update the "preferred action" that
> > > was communicated then to the drag dest. If we do this 1:1 we 
> > > probably
> > > don't get rid of any of the "racyness" you see, and I suspect 
> > > punting
> > > it to the drag dest will involve a few changes in toolkits, plus
> > > separate event handling paths from X11's.
> > 
> > Hmm. Just to get a better understanding before going further into 
> > this,
> > what is the reasoning behind wanting the destination to be able to
> > choose / prefer an action, when in XDND it was the source?
> 
> In XDND, it is the drag source which handles all events: it grabs both 
> the pointer and keyboard, on motion events it will find out the window 
> underneath and check whether it accepts DnD. If it does, the drag 
> source will communicate with the dest through client messages, but it 
> is always the drag source which receives the events and channels the 
> info.
> 
> In Wayland, it is in turn the drag destination which receives a 
> continuous stream of events as DnD progresses, channeled through the 
> compositor, either of those seem in a better position to propose one 
> without extra roundtrips.
> 
> But it is true that one of the main purposes of the "preferred action" 
> in XDND is forwarding the one selected through modifiers. This 
> parameter on wl_data_offer.notify_actions could perhaps be just 
> removed, if we go ahead at having the compositor decide the policy.
> 
> > 
> > > 
> > > > which I suspect is what Bill is talking about regarding  the 
> > > > 'state'
> > > > that is sent from the compositor.
> > > > 
> > > > > > 
> > > > > > I am VERY much in favor of moving as much logic as possible 
> > > > > > from the
> > > > > > compositor to the clients. And f(A,B,state) is a very 
> > > > > > complicated
> > > > > > function. B may not be a list, it could be, in effect, 
> > > > > > infinite in
> > > > > > size
> > > > > 
> > > > > Are you maybe folding mimetypes and actions as A/B/C above? 
> > > > > The only
> > > > > thing that can grow "unbounded" is the mimetype list, the 
> > > > > possible
> > > > > actions are always a fixed set, and resolved after the 
> > > > > mimetype is
> > > > > negotiated. AFAICS "B" corresponds to the dest side, which 
> > > > > confuses
> > > > > me, because both the picked mimetype and action will always be 
> > > > > a
> > > > > subset of A's.
> > > > > 
> > > > > > (a client conceivably could ask the user to type a filename 
> > > > > > that the
> > > > > > drop should go to), can vary quickly (as the user moves 
> > > > > > across
> > > > > > widget
> > > > > > boundaries), and can contain items the compositor has no 
> > > > > > business
> > > > > > knowing about (a paint program may ask how to tile a dropped
> > > > > > pattern).
> > > > > 
> > > > > Ah, I see, perhaps it's rather "varying over time" than 
> > > > > "infinite"?
> > > > > TBH I don't see how this is different to how mimetypes are 
> > > > > dealt with,
> > > > > you definitely don't have to calculate all possible states at 
> > > > > once,
> > > > > just for the position you're in.
> > > > 
> > > > For clarification: I think that mime types should be considered a
> > > > non-varying static set. They are advertised after creating the 
> > > > data
> > > > source before the data source is enabled (via set_selection or
> > > > start_drag).
> > > 
> > > They are :). My point above about mimetypes is that each
> > > data_device.motion can make the drag dest pick another mimetype
> > > through wl_data_offer.accept, exactly the same thing is expected 
> > > from
> > > wl_data_offer.notify_actions, And for both the drag would be
> > > considered "cancelled" if the dest provides NULL/0.
> > > 
> > > > The destination will only receive the mime types as one
> > > > batch before the offer being enabled (via enter or selection). 
> > > > This might
> > > > be less than clearly written in the protocol, but its my 
> > > > understanding of
> > > > it (see for example the wl_data_device.data_offer documentation).
> > > 
> > > Yes, you get the mimetypes between wl_data_device.data_offer/enter.
> > > For clarification, that's also the time where the drag dest is
> > > expected to get wl_data_offer.source_actions in my last draft.
> > 
> > I see. So a "source_actions" event may only be sent directly after a
> > wl_data_source is created? Meaning that the set of supported actions
> > will never again change?
> 
> It can change over time currently, triggering again the emission of 
> that event. Although I doubt this is useful, we can make it an error 
> if data_source.notify_actions is called more than once, or is called 
> after start_drag.
> 
> Cheers,
>   Carlos