[PATCH] protocol: Add DnD actions

Jonas Ådahl jadahl at gmail.com
Fri Apr 17 00:50:50 PDT 2015

On Thu, Apr 16, 2015 at 12:55:31PM +0200, Carlos Garnacho wrote:
> Hey Jonas,
> On Thu, Apr 16, 2015 at 10:15 AM, Jonas Ådahl <jadahl at gmail.com> wrote:
> <snip>
> >
> > I'd have to agree that it doesn't seem like the best thing to let the
> > compositor choose the preferred action. Having it apply compositor
> > specific policy given what the keyboard state or similar will probably
> > never work out very well, given that for example what modifier state
> > means what type of action is very application dependent.
> >
> > On the other hand, I'm not sure we can currently rely on either side
> > having keyboard focus during the drag. In weston the source will have the
> > focus because starting the drag was done with a click which gave the
> > surface keyboard focus implicitly, but what'd happen if the compositor
> > has keyboard-focus-follows-mouse? We could probably say that drag implies
> > an implicit grab on another device on the same seat to enforce no
> > changing of keyboard focus, but not sure that is better.
> In gtk+/gnome we currently have the following keybindings active during DnD:
> - Cursor keys move the drag point, modifiers affect speed
> - Esc key cancels drag
> - Modifiers alone pick an action from the offered list
> So ok, the latter is dubious to punt to compositors, but there's
> basically no other choice with the 2 first ones.
> More generally, I have the opinion that compositor grabs should
> all behave consistently, as in:
> - Ensuring clients reset all input state (we eg. don't cancel ongoing
> touches when xdg_popup/dnd/... grabs kick in)

What does "client reset all input state" mean? What state can a client

> - Ensuring the grab affects the routing of all devices/events, and
> that no client gets partial streams

You mean that a pointer grab should implicitly grab the touch and keyboard
devices as well? What do you mean by "partial streams"?

Do I understand you correctly in that you mean when a DND is triggered,
the surface loses its grab? That would be consistent with other server
driven user interactions such as move and resize.

> For the touch case, depending on how the grab is implemented, with the
> current guidelines the only 2 choices are "leave the client in
> inconsistent state" or "make the client still receive ongoing touches
> despite the pointer grab" (same applies if the grab is touch
> triggered, only with the other touches that didn't trigger the grab).
> More on topic, keyboards are also funky if we keep focus on clients,
> you can conceivably Esc/Ctrl-Q/... to close the app you're dragging
> from. IMO the way forward is precisely this, the compositor becomes in
> control of the keyboard, and we offer the missing semantics to cover
> for this.

Meaning it's the compositor that decides whether a drag is a copy or a
move? I.e. either we hard code "Ctrl" to be copy in the protocol, or DND
will behave differently on each compositor. Not sure I like any of those
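
To make the concern concrete, here is a rough sketch of what a
compositor-side policy could look like. Everything in it is hypothetical:
the action bits, the modifier bits and the function name are made up for
illustration and are not taken from any protocol draft. The point is only
that both the modifier table and the fallback order are policy that either
the protocol or each compositor would have to pick.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical action bits, not taken from any protocol draft. */
enum dnd_action {
    DND_ACTION_NONE = 0,
    DND_ACTION_COPY = 1 << 0,
    DND_ACTION_MOVE = 1 << 1,
    DND_ACTION_ASK  = 1 << 2,
    DND_ACTION_ALL  = DND_ACTION_COPY | DND_ACTION_MOVE | DND_ACTION_ASK,
};

/* Hypothetical modifier bits as a compositor might track them. */
enum kbd_mod {
    MOD_SHIFT = 1 << 0,
    MOD_CTRL  = 1 << 1,
    MOD_ALT   = 1 << 2,
};

/*
 * Pick one action from the offered set based on modifier state.
 * The table below (Ctrl = copy, Shift = move, Alt = ask) is exactly the
 * kind of policy that would either be hard coded in the protocol or
 * differ between compositors.
 */
static uint32_t compositor_pick_action(uint32_t offered, uint32_t mods)
{
    uint32_t wanted = DND_ACTION_NONE;

    /* Only known action bits may be offered. */
    assert((offered & ~(uint32_t)DND_ACTION_ALL) == 0);

    if (mods & MOD_CTRL)
        wanted = DND_ACTION_COPY;
    else if (mods & MOD_SHIFT)
        wanted = DND_ACTION_MOVE;
    else if (mods & MOD_ALT)
        wanted = DND_ACTION_ASK;

    if (wanted & offered)
        return wanted;

    /* Fallback order is also policy: copy, then move, then ask. */
    if (offered & DND_ACTION_COPY)
        return DND_ACTION_COPY;
    if (offered & DND_ACTION_MOVE)
        return DND_ACTION_MOVE;
    if (offered & DND_ACTION_ASK)
        return DND_ACTION_ASK;

    return DND_ACTION_NONE;
}
```

An application that uses Ctrl for something else during a drag has no way
to override this mapping.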

> >
> > I also don't see the variable state to be a good thing considering we'd
> > have three independent states, meaning it'd get a very racy and
> > non-deterministic protocol.
> Ideally all of this would have been right from the start as parameters
> to wl_data_device.start_drag, wl_data_device.enter and
> wl_data_offer.accept. Despite the extra combinations in data flow, I
> fail to see how this gets racy or non-deterministic, you surely will
> get a supported action and mimetype on the drag dest, or the drag will
> be cancelled. If it is more comforting, we can make it more explicit
> that wl_data_offer.notify_actions is the central point where DnD
> success/action is decided, and that wl_data_offer.accept/receive are a
> second step after it.

We can extend a request by "prefixing" it with another request that
depends on the final request (start_drag for example) to take effect.
We can do the same for events, i.e. we first send an event that only
takes effect when another event is sent. This is for example how we are
extending wl_pointer.axis with axis source information.

This way we can effectively add new parameters to start_drag or accept,
just that in the protocol we make them separate requests/events.
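
A minimal sketch of that pattern on the compositor side, assuming a
hypothetical "set_actions" prefix request and made-up action bits (neither
exists in the protocol today): the prefix request only records pending
state, and the state is latched when start_drag arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action bits, not taken from any protocol draft. */
enum dnd_action {
    DND_ACTION_NONE = 0,
    DND_ACTION_COPY = 1 << 0,
    DND_ACTION_MOVE = 1 << 1,
    DND_ACTION_ASK  = 1 << 2,
};

/* Hypothetical compositor-side bookkeeping for one data source. */
struct data_source {
    uint32_t pending_actions; /* written by the prefix request */
    uint32_t actions;         /* latched when the drag starts */
    bool drag_started;
};

/*
 * Hypothetical prefix request: it has no visible effect on its own,
 * it only records state for the request that follows.
 */
static void data_source_set_actions(struct data_source *source,
                                    uint32_t actions)
{
    assert(source != NULL);

    /* Once the drag has started the set is fixed; ignore late changes. */
    if (source->drag_started)
        return;

    source->pending_actions = actions;
}

/*
 * The final request: this is where the pending state takes effect,
 * as if the actions had been a parameter of start_drag.
 */
static void data_source_start_drag(struct data_source *source)
{
    assert(source != NULL);

    source->actions = source->pending_actions;
    source->drag_started = true;
}
```

A client that never sends the prefix request starts the drag with an empty
action set, so the compositor can tell old clients apart from new ones and
fall back to the old behaviour for them.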

> >
> > If we'd want to have the destination choose the action, the source
> > should advertise its possible actions, forwarded by the compositor to
> > the destination ("atomically", without intermediate committed state).
> wl_data_offer.source_actions?
> >
> > If we want to enable one of the clients to rely on keyboard modifier
> > state, I think this should be communicated to the deciding end point;
> Which used to be the drag source in XDND, as the holder of
> pointer/keyboard grabs. It would update the "preferred action" that
> was communicated then to the drag dest. If we do this 1:1 we probably
> don't get rid of any of the "racyness" you see, and I suspect punting
> it to the drag dest will involve a few changes in toolkits, plus
> separate event handling paths from X11's.

Hmm. Just to get a better understanding before going further into this,
what is the reasoning behind wanting the destination to be able to
choose / prefer an action, when in XDND it was the source?

> > which I suspect is what Bill is talking about regarding the 'state'
> > that is sent from the compositor.
> >
> >> >
> >> > I am VERY much in favor of moving as much logic as possible from the
> >> > compositor to the clients. And f(A,B,state) is a very complicated
> >> > function. B may not be a list, it could be, in effect, infinite in size
> >>
> >> Are you maybe folding mimetypes and actions as A/B/C above? The only
> >> thing that can grow "unbounded" is the mimetype list, the possible
> >> actions are always a fixed set, and resolved after the mimetype is
> >> negotiated. AFAICS "B" corresponds to the dest side, which confuses
> >> me, because both the picked mimetype and action will always be a
> >> subset of A's.
> >>
> >> > (a client conceivably could ask the user to type a filename that the
> >> > drop should go to), can vary quickly (as the user moves across widget
> >> > boundaries), and can contain items the compositor has no business
> >> > knowing about (a paint program may ask how to tile a dropped pattern).
> >>
> >> Ah, I see, perhaps it's rather "varying over time" than "infinite"?
> >> TBH I don't see how this is different to how mimetypes are dealt with,
> >> you definitely don't have to calculate all possible states at once,
> >> just for the position you're in.
> >
> > For clarification: I think that mime types should be considered a
> > non-varying static set. They are advertised after creating the data
> > source before the data source is enabled (via set_selection or
> > start_drag).
> They are :). My point above about mimetypes is that each
> data_device.motion can make the drag dest pick another mimetype
> through wl_data_offer.accept, exactly the same thing is expected from
> wl_data_offer.notify_actions. And for both the drag would be
> considered "cancelled" if the dest provides NULL/0.
> > The destination will only receive the mime types as one
> > batch before the offer being enabled (via enter or selection). This might
> > be less than clearly written in the protocol, but it's my understanding of
> > it (see for example the wl_data_device.data_offer documentation).
> Yes, you get the mimetypes between wl_data_device.data_offer/enter.
> For clarification, that's also the time where the drag dest is
> expected to get wl_data_offer.source_actions in my last draft.

I see. So a "source_actions" event may only be sent directly after a
wl_data_source is created? Meaning that the set of supported actions
will never again change?
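
If the source's set is fixed like that, and only the destination's answer
varies with motion, resolving the action could be as simple as the sketch
below. This is an assumption about how it might work, not something taken
from the draft: the action bits, the function name and the idea of a single
"preferred" action from the destination are all hypothetical. A result of 0
corresponds to the "cancelled" case mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical action bits, not taken from any protocol draft. */
enum dnd_action {
    DND_ACTION_NONE = 0,
    DND_ACTION_COPY = 1 << 0,
    DND_ACTION_MOVE = 1 << 1,
    DND_ACTION_ASK  = 1 << 2,
};

/*
 * Resolve the action for the current drag position.
 *
 * source_actions: fixed set advertised by the source before the drag
 * dest_actions:   set the destination accepts at the current position
 * preferred:      zero or exactly one action the destination prefers
 *
 * Returns the chosen action, or DND_ACTION_NONE if the two sides have
 * nothing in common, in which case a drop here would be cancelled.
 */
static uint32_t resolve_action(uint32_t source_actions,
                               uint32_t dest_actions,
                               uint32_t preferred)
{
    uint32_t common = source_actions & dest_actions;

    /* The preference must be empty or a single bit. */
    assert((preferred & (preferred - 1u)) == 0);

    if (common == 0)
        return DND_ACTION_NONE;

    if (preferred & common)
        return preferred;

    /* No usable preference: take the lowest common bit. */
    return common & (~common + 1u);
}
```

The result is always a subset of what the source advertised, so the source
never sees an action it did not offer.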

