libXi regression (was Re: XQuartz tablet mouse events are at the wrong location and other Xinput headaches)

Peter Hutterer peter.hutterer at who-t.net
Wed Mar 14 20:08:12 PDT 2012


On Wed, Mar 14, 2012 at 06:58:43PM -0700, Jeremy Huddleston wrote:
> So looking through the original patch is making my head spin... here are a couple things that stood out to me:
> 
> 1) This is odd.  It looks like the original code should never have worked because of the sizeof(XIButtonClass).  My issue does not run through this block, but it's still a head scratcher.
> 
> +static void
> +sizeXIButtonClassType(int num_buttons, int* structure, int* state, int* atoms)
> +{
> ...
> +    *structure = pad_to_double(sizeof(XIButtonClassInfo));
> 
> ...
> 
> -                    bout = next_block(&ptr, sizeof(XIButtonClass));
> +                    sizeXIButtonClassType(bin->num_buttons, &struct_size,
> +                                          &state_size, &labels_size);
> +                    bout = next_block(&ptr, struct_size);
> 
> 2) sizeDeviceClassType(XIButtonClass) is now longer 
> 
>  sizeDeviceClassType(int type, int num_elements)
>  {
>      int l = 0;
> +    int extra1 = 0;
> +    int extra2 = 0;
>      switch(type)
>      {
>          case XIButtonClass:
> -            l = sizeof(XIButtonClassInfo);
> -            l += num_elements * sizeof(Atom);
> -            /* Force mask alignment with longs to avoid
> -             * unaligned access when accessing the atoms. */
> -            l += ((((num_elements + 7)/8) + 3)/4) * sizeof(Atom);
> +            sizeXIButtonClassType(num_elements, &l, &extra1, &extra2);
> +            l += extra1 + extra2;
>              break;
> 
> ^^^ Why is extra1 (the size of the state mask) now being added here?  It
> wasn't before.  Was this just a latent bug that was fixed with this change
> rather than as a separate commit?

there are a few other issues here. first, the sizeof(Atom) appears wrong, or
at least excessive (2d638fc37b0dbf28e5c826f74f68ada83a8c3e2b).
this should the oirginal code + whatever is left to get to multiple of
sizeof(long). Not a huge deal though, it's just a bit for bloat.

the code originally did sizeof(struct) + length of mask + size of atom
array. it still does so now, though the labels length is too long in
sizeXIButtonClassType (the labels += line can be skipped)

> 
> 3) copy_classes seems to be the culprit.  With the new changes, we're passing larger values to next_block 
> 
> -                    cls_lib = next_block(&ptr_lib, sizeof(XIButtonClassInfo));
>                      cls_wire = (xXIButtonInfo*)any_wire;
> +                    sizeXIButtonClassType(cls_wire->num_buttons,
> +                                          &struct_size, &state_size,
> +                                          &labels_size);
> +                    cls_lib = next_block(&ptr_lib, struct_size);
> ...
> -                    cls_lib->state.mask = next_block(&ptr_lib, size * sizeof(Atom));
> +                    cls_lib->state.mask_len = state_size;
> +                    cls_lib->state.mask = next_block(&ptr_lib, state_size);
> ...
> -                    cls_lib->labels = next_block(&ptr_lib, cls_lib->num_buttons * sizeof(Atom));
> +                    cls_lib->labels = next_block(&ptr_lib, labels_size);
> 
> A comparison of old steps and new steps:
> ~ $ ./a.out 
> sizeof(XIButtonClassInfo): 40, struct_size: 40
> size * sizeof(Atom): 8, state_size: 8
> old_mask_len: 4, state_size: 8
> num_buttons * sizeof(Atom): 56, labels_size: 64
> 
> So the issue seems to be that the new "cls_lib->state.mask = next_block(&ptr_lib, state_size)" steps over our first button.

next_block (not the best-named function) just allocates a new block of data.
as long as the original malloc was correctly sized, next_block doesn't
overwrite data. as you saw from the patch, the culprit was the assignment of
state_size to mask_len and then re-using mask_len/state_size to copy data,
when state_size may have been aligned and thus larger than mask_len.
 
Cheers,
  Peter

> On Mar 14, 2012, at 6:00 PM, Jeremy Huddleston <jeremyhu at apple.com> wrote:
> 
> > Reverting c1a5a70b51f12dedf354102217c7cd4247ed3a4b from libXi fixed the issue for me.  I haven't looked into why, but the patch's changes are certainly related:
> > 
> > good: libXi-1.5.99.3
> > bad: libXi-1.6.0
> > 
> > $ git bisect log
> > # bad: [70b730b0548ca9e408f14f2576b972beb32a0ad0] libXi 1.6.0
> > # good: [34964b05c16161de65709d60799b9ad97ce56296] libXi 1.5.99.3
> > git bisect start 'libXi-1.6.0' 'libXi-1.5.99.3'
> > # bad: [1b9f0394c3d4d3833f8560ae8170a4d5842419ab] Fix XIScrollClass increment value on 32-bit machines
> > git bisect bad 1b9f0394c3d4d3833f8560ae8170a4d5842419ab
> > # bad: [c1a5a70b51f12dedf354102217c7cd4247ed3a4b] Fix bus error on MIPS N32 for bug #38331.
> > git bisect bad c1a5a70b51f12dedf354102217c7cd4247ed3a4b
> > 
> > c1a5a70b51f12dedf354102217c7cd4247ed3a4b is the first bad commit
> > commit c1a5a70b51f12dedf354102217c7cd4247ed3a4b
> > Author: Michał Masłowski <mtjm at mtjm.eu>
> > Date:   Tue Feb 21 20:54:40 2012 +0100
> > 
> >    Fix bus error on MIPS N32 for bug #38331.
> > 
> >    XIValuatorClassInfo and XIScrollClassInfo might have an address
> >    of 4 bytes modulo 8, while they contain doubles which need 8 byte
> >    alignment.  This is fixed by adding extra padding after each structure
> >    or array in sizeDeviceClassType and adding helper functions to
> >    determine sizes and padding only in one place.
> > 
> >    Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=38331
> >    Signed-off-by: Michał Masłowski <mtjm at mtjm.eu>
> >    Signed-off-by: Peter Hutterer <peter.hutterer at who-t.net>
> > 
> > :040000 040000 6b49f07dcdd66ae31add8e9b543e6d6a0586231c a389cffc1796584cc7e90ea296b57f7ecf2224d9 M	src
> > 
> > 
> > On Mar 14, 2012, at 5:06 PM, Peter Hutterer <peter.hutterer at who-t.net> wrote:
> >>> (gdb) print b->num_buttons
> >>> $24 = 7
> >>> (gdb) print b->labels[-1]
> >>> $25 = 498216206336
> >>> 
> >>> Based on xlsatoms, I'd expect 116-122 as the button atoms:
> >>> 116	Button Left
> >>> 117	Button Middle
> >>> 118	Button Right
> >>> 119	Button Wheel Up
> >>> 120	Button Wheel Down
> >>> 121	Button Horiz Wheel Left
> >>> 122	Button Horiz Wheel Right
> >>> 
> >>> It works when the client is i386/Linux or i386/darwin and fails when x86_64/darwin, ppc/Linux, or ppc64/Linux ... is this possibly an inputproto issue?
> >> 
> >> libXi or xinput, I suspect. curious though, since it works fine here in
> >> x86_64. Can you bisect libXi from 1.5.99.2 to 1.6.0, I wonder if we
> >> introduced a bug here with the alignments.
> >> 
> >> 2d638fc37b0dbf28e5c826f74f68ada83a8c3e2b is another possible candidate to
> >> revert and test.
> >> 
> >> put a breakpoint in libXi's copy_classes on the XIButtonClass case and see
> >> what comes down the wire and where the offset is introduced. if it's already
> >> wrong on the wire, then the server has a bug here.
> 
> 
> 
> 



More information about the xorg-devel mailing list