[PATCH weston 21/25] protocol: add weston_touch_calibration

Fri Apr 13 04:31:39 UTC 2018

On Wed, Apr 11, 2018 at 02:00:49PM +0300, Pekka Paalanen wrote:
> On Wed, 11 Apr 2018 10:16:46 +1000
> Peter Hutterer <peter.hutterer at who-t.net> wrote:
> > > > > +	An event carries the touch device identification and the associated
> > > > > +	output or head (display connector) name.
> > > > > +
> > > > > +	On platforms using udev, the device identification is the udev DEVPATH.    
> > > > 
> > > > do we want to really set this in stone in the protocol? Also, the DEVPATH
> > > > is without /sys, right? Why not make this an open format and let the client
> > > > figure it out. It's not hard to strcmp for /sys or /dev/input/event in the
> > > > client and iirc we have precedence for that in the tablet protocol. or in
> > > > libinput for tablets, can't remember now :)  
> > > 
> > > Yes, I wanted to have a clear, unambgious definition. What udev calls
> > > the DEVPATH is the path without /sys. We can revise this decision
> > > if there ever arises a need for anything else, but it's still a Weston
> > > private protocol for now.
> > > 
> > > When would one use /dev paths?  
> > 
> > *shrug*, not sure in this case, just used that as an example here. Point was
> > more: prefixing with /sys makes it a lot more identifiable than just using
> > the DEVPATH, not least because you can immediately stat that path (or ls, important
> > for bug reporters).
> > 
> > also, udev_device_new_from_syspath() seems to error out if it doesn't start
> > with "/sys", so any programmatic user would have to pre-pend /sys anway.
> 
> Ok, I can change it to be "DEVPATH prefixed with /sys". Or is SYSPATH a
> thing defined by udev? There is udev_device_get_syspath() but I cannot
> get 'udevadm info' to print that. Originally I wanted to stick to what
> I can see with 'udevadm info'.

I checked the libudev sources and it has:
src/libsystemd/sd-device/sd-device.c:device_set_syspath()

    devpath = syspath + STRLEN("/sys");

so DEVPATH is literally the substring of the syspath.

> but I cannot get 'udevadm info' to print that.

this is because udevadm info takes the syspath as input argument to figure
out which device it needs to print. So I suspect it decided not to print it
because you've already given it on the commandline anyway :) Even though any
single device has multiple syspaths that are all symlinked to the
/sys/$DEVPATH node.

Ironically, when you use udevadm info -p $DEVPATH, it also needs to prefix
'/sys' to be able to do anything with the device (see
src/udev/udevadm-info.c)

> > > Btw. are device node and DEVPATH equally racy wrt. device getting
> > > replaced with a different one?  
> > 
> > not 100% sure. Theory is that if you have the DEVPATH you can get the
> > devnode through udev and by then all races should've been resolved. If you
> > get the device node independently, then you're bound to run into races.
> 
> Alright, I'll pay no further thought to device nodes.

Reading this again and thinking about it some more: if you replace a device
node quickly enough, you'll lag behind and run into this issue. So by the
time you get the udev device's devnode the device may have been removed and
the kernel replaced it with another device, re-using the event node. You'll
eventually get the device-removed event from udev but...

In the libinput test suite I work around this by comparing the syspaths, in
the form of:
   devnode = udev_device_get_devnode(dev);
   fd = open(devnode)
   new_dev = udev_device_new_from_devnum(fd)

   strcmp(udev_device_get_syspath(dev), udev_device_get_syspath(new_dev))
   if not equal:
       oops, we've raced ourselves into a corner

I need this because I create and remove hundreds of devices per minute.
I doubt it's needed in a real-world situation though.

> > > We could have rules like if it starts with /sys, remove /sys and you
> > > have the DEVPATH; if it starts with /dev, it's the device node, and so
> > > on, but I don't see the point. So rather than pretending that I cater
> > > for other environments, I just rule them out for now.  
> > 
> > sure, it's fine to just support udev. I just question the utility of DEVPATH
> > in particular. I mean DEVPATH=/devices/rmi4-00/input/input25/event17 on my
> > touchpad, but /sys/class/input/event17 gives me the same udev device. Or
> > /sys/dev/char/.../
> 
> You cannot use any alias in the 'save' or 'create_calibrator' requests
> in any case, Weston does not create a new udev_device from it. It
> simply compares the given device string to what it advertised.
> 
> DEVPATH seemed to be all of unique, well-defined, well-known, and
> easily available and usable. Otherwise I don't care what the device
> string is.

sure, my argument was that using the syspath gives you the same but it's
more flexible when used with other tools.

> > > > > +      </description>
> > > > > +      <arg name="device" type="string"
> > > > > +           summary="the touch device identification"/>
> > > > > +      <arg name="head" type="string"
> > > > > +           summary="name of the head or display connector"/>
> > > > > +    </event>
> > > > > +  </interface>
> > > > > +
> > > > > +  <interface name="weston_touch_calibrator" version="1">
> > > > > +    <description summary="calibrator surface for a specific touch device">
> > > > > +      On creation, this object is tied to a specific touch device. The
> > > > > +      server sends a configure event which the client must obey with the
> > > > > +      associated wl_surface.
> > > > > +
> > > > > +      Once the client has committed content to the surface, the server can
> > > > > +      grab the touch input device, prevent it from emitting normal touch events,
> > > > > +      show the surface on the correct output, and relay input events from the
> > > > > +      touch device via this protocol object.
> > > > > +
> > > > > +      Touch events from other touch devices than the one tied to this object
> > > > > +      must generate wrong_touch events on at least touch-down and must not
> > > > > +      generate normal or calibration touch events.
> > > > > +
> > > > > +      At any time, the server can choose to cancel the calibration procedure by
> > > > > +      sending the cancel_calibration event. This should also be used if the
> > > > > +      touch device disappears or anything else prevents the calibration from
> > > > > +      continuing on the server side.
> > > > > +
> > > > > +      If the wl_surface is destroyed, the server must cancel the calibration.
> > > > > +
> > > > > +      The touch event coordinates and conversion results are delivered in
> > > > > +      calibration units. Calibration units are in the closed interval
> > > > > +      [0.0, 1.0] mapped into 32-bit unsigned integers. An integer can be    
> > > > 
> > > > should probably add what 0.0 and 1.0 represent on each axis, i.e. nominal
> > > > width and height of the device's sensor. Or is this something we need to
> > > > hide?  
> > > 
> > > I'm not sure. The underlying assumption is that there is a finite and
> > > closed range of values a sensor can output, and that is mapped to [0,
> > > 1]. I don't know what nominal width and height are or can the values
> > > extend to negative. It might be easier to define the calibration units
> > > without defining those first.  
> > 
> > sorry, I used "nominal" before I rewrote to use "sensor", probably made it
> > more confusing. On some devices, the sensor is larger than the screen so you
> > can get coordinates outside the screen area. This is not a technical
> > problem, just a user perception mismatch that doens't need to be exposed and
> > after all that's what calibration is about.
> > 
> > Once you apply calibration though you can get real negative coordinates,
> > especially if the device advertises a [n:m] range but then sends events less
> > than n. For example, all synaptics touchpads (pre 2014, some after that) do
> > that.
> 
> But wait, if there are actually devices that report the range [n:m] but
> can still send values less than n or greater than m, it means I cannot
> transmit those with the encoding set up here, because
> libinput_event_touch_get_x_transformed(touch_event, 1) can return
> values outside of [0, 1] even when the loaded calibration matrix is
> identity. Is that right?

correct, that can happen.

> Do I need to find an encoding to cope with that, or can I just reject
> such touches?
> 
> I mean, the range [n:m] is in raw input coordinates, before any
> normalization or calibration in userspace, right?

uhm. let me just detail this and you can pick which one was the answer to
that question:
- the kernel exposes [n:m] axis ranges for ABS_X/Y and the MT equivalents
- libinput converts this into mm by default, see libinput_event_touch_get_x()
- libinput converts this to a ranged value [0:N] when calling
  libnput_event_touch_get_x_transformed(N). This is what weston uses to get
  the screen coordinates.

Both libinput values have the calibration factored in already.

Weston doesn't ever see raw input coordinates. But if you use
get_x_transformed, you can get values outside [0:N]. This happens when
the [n:m] range is incorrect. Accommodating for some error margin, libinput
will print a warning to the log when this happens.

The correct solution would be to warn the user that the device is garbage
and make them put a 60-evdev.hwdb quirk in to fix the axis ranges.
Depending on the device, this could be a generic solution but let's not bet
on that...

The user-friendly solution would be to have two layers of mapping. By
default, you map the libinput range into [0, 1]. Whenever you see negative
values or values > 1 you notify the user that this screen is garbage. The
only option you have now is to get the user to touch the furthest extents of
all axes to get the lowest and highest values the axis provides (similar to
the touchpad-edge-detector tool). Then you can map those into the libinput
provided range so that e.g. libinput's -0.25 is normalized 0 and 1.25 is
normalized 1.0.

Now you can restart calibration with that extra mapping on top. 

A few caveats, beyond the effort required:
- even with the calibration matrix applied, the libinput ranges will still
  be outside the [0:N] range. Only a real axis fix can fix that
- you cannot know the true min/max until you get respective event from the
  screen

So yeah, the situation is less than ideal...

I've considered auto-scaling libinput into the new axis range but this is
likely going to hurt even more because it's undiscoverable and for most
devices, a quirk in the hwdb file can solve this problem properly.

> > These devices are a problem because we rely on the coordinate range for the
> > size and then on the calibration on top to normalize the input. So they may
> > need two fixes, one for the range adjustment, one for calibration.
> 
> If I can just transmit whatever values I get for "normalized" with
> identity in the above, then the calibration matrix will automatically
> account for any out-of-range values - I mean they become a non-issue,
> because the compositor just tranforms them to the output or a global
> coordinate system first and then clips to the actual output area. Or
> doesn't clip, makes no difference.
> 
> > > It doesn't really matter, the important point is that when the
> > > resulting matrix is loaded into libinput and then libinput is used
> > > according to its API, the result is as expected. I'd be tempted to just
> > > punt all this to libinput.  
> > 
> > libinput will sort-of punt it back :)
> > https://wayland.freedesktop.org/libinput/doc/latest/tablet-support.html#tablet-bounds
> > 
> 
> I don't see a problem in that section. The goal of this calibration is
> to match input event coordinates to output pixel coordinates in the
> complete pipeline by adjusting the normalized calibration matrix in
> libinput. It's ok for calibrated input coordinates go beyond any
> limits, the input and output coordinate planes can be thought as
> infinite.
> 
> Or do you refer to the functions returning positions in millimeters? I
> never looked at those, they're not used with touchscreens. I don't even
> know if a calibration matrix affects them.

There's a corner-case with touchscreens and mm. Plenty of cheap touchscreens
don't have a resolution set, so the mm value is calculated using the default
resolution of 1.

The calibration matrix is applied to the mm as well. That's largely a
side-effect for the matrix originally being used for rotation rather than
scaling or translation.

Cheers,
   Peter

> > > Libinput doc for libinput_device_config_calibration_set_matrix() says:
> > > 
> > > 	"The translation component (c, f) is expected to be normalized
> > > 	to the device coordinate range."
> > > 
> > > I suppose rephrasing with "the device coordinate range" would be ok?
> > > But is "device coordinates" well defined here?
> > > 
> > > How about adding:
> > > 
> > > The calibration units cover the device coordinate range exactly.  
> > 
> > yep, works for me.
> > 
> > > > > +
> > > > > +    <event name="configure">
> > > > > +      <description summary="surface size">
> > > > > +	This event tells the client what size to make the surface. The client
> > > > > +	must obey the size exactly on the next commit with a wl_buffer.
> > > > > +
> > > > > +	This event shall be sent once as a response to creating a
> > > > > +	weston_touch_calibrator object.
> > > > > +      </description>
> > > > > +      <arg name="width" type="int" summary="surface width"/>
> > > > > +      <arg name="height" type="int" summary="surface height"/>
> > > > > +    </event>
> > > > > +
> > > > > +    <event name="cancel_calibration">
> > > > > +      <description summary="cancel the calibration procedure">
> > > > > +	This is sent when the server wants to cancel the calibration and
> > > > > +	drop the touch device grab. The server unmaps the surface, if
> > > > > +	it was mapped.
> > > > > +
> > > > > +	The weston_touch_calibrator object will not send any more events. The
> > > > > +	client should destroy it.
> > > > > +      </description>
> > > > > +    </event>
> > > > > +
> > > > > +    <event name="wrong_touch">    
> > > > 
> > > > "invalid_device_touch" maybe? 'wrong' feels.... wrong :)
> > > > also, while explained in the description, wrong_touch could also refer to a
> > > > touch that's outside the boundary, at the wrong time, etc. All these things
> > > > one could assume if one doesn't read the docs.  
> > > 
> > > Yeah, this is for any touch event (most likely touch-down event) that
> > > cannot be used for the on-going calibration. "unusable_touch"?
> > > "bad_touch"? :-D
> > > "incorrect_touch"?  
> > 
> > "invalid_touch" then, I'd say. Simply because it only specifies that it
> > can't be used for whatever reason. Separating invalid from incorrect device
> > might be good for user interface purposes, but whether it's worth the
> > effort is a different matter.
> 
> I don't have a use for invalid other than an incorrect device, unless
> input coordinates out of the advertised range are possible to get and
> impossible to transmit.
> 
> 
> Thanks,
> pq
> 
> > > > > +      <description summary="user touched a wrong touchscreen">
> > > > > +	Calibration can only be done on one touch input device at a time. If
> > > > > +	the user touches another touch device during calibration, the server
> > > > > +	sends this event to notify the calibration client. The client should
> > > > > +	show feedback to the user that he touched a wrong screen. This is
> > > > > +	useful particularly when it is impossible for a user to tell which
> > > > > +	screen he should touch (when the screens are cloned).
> > > > > +      </description>
> > > > > +    </event>